
Logit models and logistic regressions for social networks: II. Multivariate relations

Philippa Pattison*
University of Melbourne, Australia

Stanley Wasserman
University of Illinois, USA

The research described here builds on our previous work by generalizing the univariate models described there to models for multivariate relations. This family, labelled $p^*$, generalizes the Markov random graphs of Frank and Strauss, which were further developed by them and others, building on Besag's ideas on estimation. These models were first used to model random variables embedded in lattices by Ising, and have been quite common in the study of spatial data. Here, they are applied to the statistical analysis of multigraphs, in general, and the analysis of multivariate social networks, in particular. In this paper, we show how to formulate models for multivariate social networks by considering a range of theoretical claims about social structure. We illustrate the models by developing structural models for several multivariate networks.

1. Introduction and background

The goal of this paper is to extend the family of models termed $p^*$, presented in Strauss & Ikeda (1990) and Wasserman & Pattison (1996), to multivariate social relations. The $p^*$ family is a class of models for a single dichotomous social network relation, with parameters reflecting a wide variety of possible structural features. Yet social network relationships are often observed in multivariate form, being designed to reflect different qualities of social relations and their interrelationships. Indeed, the problem of characterizing the interdependence of social ties of different types has a long theoretical history (Nadel, 1957; White, 1963; White, Boorman & Breiger, 1976; Boyd, 1991; Pattison, 1993).

Here we present an extension of the $p^*$ family to multivariate social network data. We describe a general class of models that can be used to evaluate a wide range of hypotheses about the forms of structural interdependence in multiple relations. Two examples are presented to illustrate the models and the ways in which they can be used to investigate hypothesized interdependencies.

Statistical models for multivariate networks are scarce. Frank & Nowicki (1993) introduced multivariate Bernoulli models for networks; see also Frank (1987, 1991, 1997).

British Journal of Mathematical and Statistical Psychology (1999), 52, 169–193. Printed in Great Britain. © 1999 The British Psychological Society

* Requests for reprints should be addressed to Dr Philippa Pattison, Department of Psychology, University of Melbourne, Parkville, Victoria 3052, Australia (e-mail: pattision@psych.unimelb.edu.au).

Fienberg, Meyer & Wasserman (1985), Wasserman (1987), Iacobucci & Wasserman (1987) and Iacobucci (1989) extended the $p_1$ family to social networks in which more than one relation is measured, building on earlier work of Davis (1968), Galaskiewicz & Marsden (1978), Fienberg & Wasserman (1981), Holland & Leinhardt (1981) and Fienberg, Meyer & Wasserman (1981). All of these models assume dyadic independence, an assumption that has since come to be seen as unduly restrictive (see Wasserman & Faust, 1994, Chapter 15). Other interesting approaches to the statistical analysis of multiple relations include Katz & Powell (1953), Hubert & Baker (1978), Frank, Lundquist, Wellman & Wilson (1986) and Wellman, Frank, Espinoza, Lundquist & Wilson (1991). The scarcity of a literature on statistical models for multivariate graphs (or multirelational networks) is underscored by the fact that only recently has there been any work on conditional uniform multigraph distributions (Wasserman & Pattison, in press).

It would clearly be useful, therefore, to construct models for multivariate networks that possess a statistical basis but that do not make the implausible assumption of dyadic independence. The purpose of this paper is to describe such models. These first arose as models for lattice structures (Ising, 1925) and have found much use in spatial applications (Besag, 1975, 1977a; Wasserman, 1978; Strauss, 1992). They may also be seen as a special case of models described in the graphical modelling literature (for example, Cox & Wermuth, 1996; Edwards, 1995; Lauritzen, 1996; Whittaker, 1990). Wasserman & Pattison (1996) elaborated upon Frank & Strauss's (1986) application of these models to social networks, and utilized the standard pseudo-likelihood estimation approach to fitting these models, first described by Besag (1975, 1977b) and applied to networks by Strauss & Ikeda (1990).

After presenting notation, we introduce the multivariate $p^*$ model. We describe a number of particularly useful forms of the model and illustrate their application to two quite different multivariate networks.

2. Some notation

We adhere to the notation presented in Wasserman & Pattison (1996). A social network is defined as a set of $g$ social actors and a collection of $r$ social relations that specify how these actors are related to one another.

We let $N$ denote the set of actors, $N = \{1, 2, \ldots, g\}$, and let $\mathcal{X}_m$ denote a social relation of type $m$. $\mathcal{X}_m$ is a set of ordered pairs recording the presence or absence of relational ties of type $m$ between pairs of actors. If the ordered pair $(i, j)$ is in this set, then the first actor ($i$) in the pair has a relational tie of type $m$ to the second actor ($j$) in the pair. We let $R = \{1, 2, \ldots, r\}$ denote the set of relation types, or labels.

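Any social relation $\mathcal{X}_m$ can be represented by a $g \times g$ matrix, often referred to as a sociomatrix, $\mathbf{X}_m$, where

$$(\mathbf{X}_m)_{ij} = \begin{cases} 1 & \text{if } (i, j) \in \mathcal{X}_m, \\ 0 & \text{otherwise.} \end{cases}$$

The converse of the relation $\mathcal{X}_m$, which we denote by $\mathcal{X}'_m$, is represented by

$$(\mathbf{X}'_m)_{ij} = \begin{cases} 1 & \text{if } (j, i) \in \mathcal{X}_m, \\ 0 & \text{otherwise.} \end{cases}$$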

In the general multivariate case, relational ties are recorded for $r$ relations $\mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_r$, with associated (socio)matrices $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_r$. We also consider the relations constructed from intersections and compositions of these $r$ measured relations. Formally, the intersection $\mathcal{X}_k \cap \mathcal{X}_h$ of relations $\mathcal{X}_k$ and $\mathcal{X}_h$ is given by the array $\mathbf{X}_k \cap \mathbf{X}_h$, which has entries

$$(\mathbf{X}_k \cap \mathbf{X}_h)_{ij} = \begin{cases} 1 & \text{if } (\mathbf{X}_k)_{ij} = 1 \text{ and } (\mathbf{X}_h)_{ij} = 1, \\ 0 & \text{otherwise.} \end{cases}$$

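The composition (or compound relation) $\mathcal{X}_k\mathcal{X}_h$ of relations $\mathcal{X}_k$ and $\mathcal{X}_h$ is given by the array $\mathbf{X}_k\mathbf{X}_h$, which has entries

$$(\mathbf{X}_k\mathbf{X}_h)_{ij} = \begin{cases} 1 & \text{if } (\mathbf{X}_k)_{il} = 1 \text{ and } (\mathbf{X}_h)_{lj} = 1 \text{ for some } l \in N, \\ 0 & \text{otherwise.} \end{cases}$$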
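The converse, intersection and composition operations defined above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names and the two 3-actor relations (`likes`, `advises`) are hypothetical and do not come from the paper, and sociomatrices are assumed to be stored as nested lists of 0/1 entries.

```python
# Sociomatrices are g x g nested lists of 0/1 entries (an assumed encoding).

def converse(x):
    """Converse relation: entry (i, j) is 1 exactly when (j, i) is a tie of x."""
    g = len(x)
    return [[x[j][i] for j in range(g)] for i in range(g)]

def intersection(xk, xh):
    """Intersection: entry (i, j) is 1 when both relations hold from i to j."""
    g = len(xk)
    return [[xk[i][j] & xh[i][j] for j in range(g)] for i in range(g)]

def composition(xk, xh):
    """Composition: entry (i, j) is 1 when, for some actor l,
    i has an xk-tie to l and l has an xh-tie to j."""
    g = len(xk)
    return [[int(any(xk[i][l] and xh[l][j] for l in range(g)))
             for j in range(g)] for i in range(g)]

# Hypothetical data for three actors.
likes = [[0, 1, 0],
         [0, 0, 1],
         [0, 0, 0]]
advises = [[0, 1, 1],
           [0, 0, 0],
           [1, 0, 0]]

print(converse(likes))
print(intersection(likes, advises))
print(composition(likes, advises))
```

Composition here is a Boolean matrix product, so a compound relation only records whether at least one intermediate actor exists, not how many.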

Since we are concerned here with multivariate networks, we also view the sequence of matrices $(\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_r)$ as defining a three-way array $\mathbf{X}$ of size $g \times g \times r$, with entries $X_{ijm} = (\mathbf{X}_m)_{ij}$. Since these (socio)matrices will be assumed to be random quantities, we use lower-case bold-face characters (such as $\mathbf{x}$) to denote realizations of the random quantities.

To specify the multivariate $p^*$ model, we introduce several new arrays constructed from $\mathbf{X}$. First, we define $\mathbf{X}^+_{ijm}$ as the array formed from $\mathbf{X}$ where the tie from $i$ to $j$ of type $m$ is forced to be present:

$$(\mathbf{X}^+_{ijm})_{nop} = \begin{cases} X_{nop} & \text{if } (n, o, p) \neq (i, j, m), \\ 1 & \text{if } (n, o, p) = (i, j, m). \end{cases}$$

Thus, $\mathbf{X}^+_{ijm}$ differs from $\mathbf{X}$ at most in the $(i, j, m)$th entry, which is forced to be 1. Next, we define $\mathbf{X}^-_{ijm}$ as the array formed from $\mathbf{X}$ where the tie from $i$ to $j$ of type $m$ is forced to be absent:

$$(\mathbf{X}^-_{ijm})_{nop} = \begin{cases} X_{nop} & \text{if } (n, o, p) \neq (i, j, m), \\ 0 & \text{if } (n, o, p) = (i, j, m). \end{cases}$$

We also define $\mathbf{X}^c_{ijm}$ as the array for the complement relation for $\mathbf{X}$ of the tie from $i$ to $j$ of type $m$:

$$(\mathbf{X}^c_{ijm})_{nop} = \begin{cases} X_{nop} & \text{if } (n, o, p) \neq (i, j, m), \\ \text{undefined} & \text{if } (n, o, p) = (i, j, m). \end{cases}$$

The complement relation has no relational tie of type $m$ coded from $i$ to $j$; one can view this single variable as missing.
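As a minimal sketch of these three derived arrays, the following Python helper copies a three-way array and overwrites a single entry. The helper name `force_tie`, the nested-list layout `x[i][j][m]`, the use of `None` for the undefined entry, and the small data set are all assumptions made for illustration.

```python
import copy

def force_tie(x, i, j, m, value):
    """Return a copy of the g x g x r array x with entry (i, j, m) replaced.

    value=1 gives the array with the tie forced present, value=0 the array
    with the tie forced absent, and value=None marks the entry as missing,
    which is one possible encoding of the complement array.
    """
    y = copy.deepcopy(x)
    y[i][j][m] = value
    return y

# Hypothetical array with g = 2 actors and r = 2 relations, indexed x[i][j][m].
x = [[[0, 0], [1, 0]],
     [[0, 1], [0, 0]]]

x_plus = force_tie(x, 0, 1, 1, 1)     # tie (0, 1) of type 1 forced present
x_minus = force_tie(x, 0, 1, 1, 0)    # same tie forced absent
x_comp = force_tie(x, 0, 1, 1, None)  # same tie treated as missing
```

Because the observed entry is 0 in this example, the forced-absent array coincides with the observed array, while the forced-present array differs from it in exactly one cell.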

As in Wasserman & Pattison (1996), we will let $\omega$ represent logits, that is, log-odds ratios comparing the probability of one outcome of a random variable to the probability of another outcome on a logarithmic scale.

3. Multivariate $p^*$

The original specification of the class of models $p^*$ was just for a single dichotomous relation, as described by Wasserman & Pattison (1996) (see also Frank & Strauss, 1986; Rennolls, 1995; Strauss & Ikeda, 1990). Generalizations to more than one relation were mentioned in concluding remarks by Frank & Strauss (1986, Section 6) and by Strauss & Ikeda (1990, Section 5), and discussed in brief by Frank (1991, 1997) and Frank & Nowicki (1993).


3.1. Theory

We first present a generalization of the basic $p^*$ model to multivariate networks. We define a set of random variables based on the relational ties in the network, and then construct a dependence graph for this situation. The Hammersley–Clifford theorem (Besag, 1974) posits a probability distribution for these random variables by using the postulated dependence graph. The exact form of the dependence graph depends on the nature of the substantive hypotheses about the social network under study; we discuss such hypotheses at length.

3.1.1. Probability models for multivariate directed random graphs

Any observed multivariate network may be regarded as a realization $\mathbf{x} = [x_{ijm}]$ of a random three-way binary array $\mathbf{X} = [X_{ijm}]$. In general, the entries of the array $\mathbf{X}$ cannot be assumed to be independent; consequently, it is helpful to specify a dependence structure for the random variables $X_{ijm}$, as originally suggested by Frank & Strauss (1986).

The dependence structure for these random variables is determined by the dependence graph $D$ of the random array $\mathbf{X}$. $D$ is itself a graph whose nodes are elements of the index set $\{(i, j, m) : i, j \in N, i \neq j, m \in R\}$ for the random variables in $\mathbf{X}$, and whose edges signify pairs of the random variables that are assumed to be conditionally dependent (given the values of all other random variables). More formally, a dependence graph for a multivariate social network has node set

$$N_D = \{(i, j, m) : i, j \in N, i \neq j, m \in R\}.$$

The edges of $D$ are given by

$$E_D = \{((i, j, m), (k, l, h)) : X_{ijm} \text{ and } X_{klh} \text{ are conditionally dependent}\}.$$

The dependence graph is an example of what is termed an independence graph in the graphical modelling literature (for example, Lauritzen, 1996); see Robins (1998) for an extended discussion of the application of graphical modelling techniques to social network models.

As Frank & Strauss (1986) observed for univariate graphs and associated two-way binary arrays, several well-known classes of distributions for random graphs may be specified in terms of the structure of the dependence graph. For example, the assumption of conditional independence for all pairs of random variables representing distinct relational ties (that is, $X_{ijm}$ and $X_{klh}$ are independent whenever $i \neq k$ and/or $j \neq l$) leads to the class of Bernoulli multigraphs (see Frank & Nowicki, 1993; Wasserman & Pattison, in press); the assumption of conditional dependence of $X_{ijm}$ and $X_{klh}$ if and only if $\{i, j\} = \{k, l\}$ leads to the class of multivariate dyad independence models (see Wasserman, 1987; Wasserman & Pattison, in press). The assumption of conditional independence of $X_{ijm}$ and $X_{klh}$ if and only if $\{i, j\} \cap \{k, l\} = \emptyset$ gives rise to the class of multivariate Markov random graphs. Of course, if the dependence graph is fully connected, then a general class of random graphs is obtained.
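The three dependence assumptions just described can be written as predicates on pairs of index triples, which makes it easy to build small dependence graphs and count their edges. The sketch below is illustrative only: the function names and the string labels `"bernoulli"`, `"dyad"` and `"markov"` are hypothetical, and actors and relations are numbered from 0.

```python
def dependent(a, b, kind):
    """Whether two distinct variables X_ijm and X_klh are joined by an edge of D."""
    (i, j, m), (k, l, h) = a, b
    if a == b:
        return False
    if kind == "bernoulli":   # only ties on the same ordered pair (i, j)
        return (i, j) == (k, l)
    if kind == "dyad":        # same unordered pair {i, j} = {k, l}
        return {i, j} == {k, l}
    if kind == "markov":      # the two ties share at least one actor
        return bool({i, j} & {k, l})
    raise ValueError(kind)

def dependence_graph(g, r, kind):
    """Node and edge lists of the dependence graph for g actors and r relations."""
    nodes = [(i, j, m) for i in range(g) for j in range(g) if i != j
             for m in range(r)]
    edges = [(a, b) for n, a in enumerate(nodes) for b in nodes[n + 1:]
             if dependent(a, b, kind)]
    return nodes, edges

for kind in ("bernoulli", "dyad", "markov"):
    nodes, edges = dependence_graph(4, 2, kind)
    print(kind, len(nodes), len(edges))
```

Even for four actors the Markov dependence graph is much denser than the dyadic one, which is why the resulting models carry many more potential parameters.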

The Hammersley–Clifford theorem (Besag, 1974) establishes that a probability model for $\mathbf{X}$ depends only on the complete subgraphs, or cliques, of the dependence graph $D$. (A subset $A$ of $N_D$ is complete if every pair of nodes in $A$ is linked by an edge of $D$. A subset comprising a single node is also regarded as complete.) In particular, application of the Hammersley–Clifford theorem yields a characterization of $P(\mathbf{X} = \mathbf{x})$ in the form of an exponential family of distributions:

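$$P(\mathbf{X} = \mathbf{x}) = \frac{1}{\kappa} \exp\Biggl( \sum_{A \subseteq N_D} \lambda_A \prod_{(i,j,m) \in A} x_{ijm} \Biggr), \tag{1}$$

where $\kappa = \sum_{\mathbf{x}} \exp\bigl( \sum_{A \subseteq N_D} \lambda_A \prod_{(i,j,m) \in A} x_{ijm} \bigr)$ is a normalizing quantity; $D$ is the dependence graph for $\mathbf{X}$ (the summation is over all subsets $A$ of nodes of $D$); $\prod_{(i,j,m) \in A} x_{ijm}$ is the sufficient statistic corresponding to the parameter $\lambda_A$; and $\lambda_A = 0$ whenever the subgraph induced by the nodes in $A$ is not a clique of $D$.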

The set of non-zero parameters in a model for $P(\mathbf{X} = \mathbf{x})$ is thus determined by the collection of the maximal cliques of the dependence graph. A maximal clique is a complete subgraph that is not properly contained in any other complete subgraph. Note that any subgraph of a complete subgraph is also complete, so that if $A$ is a maximal clique of $D$, then there will be non-zero parameters for $A$ and all of its subgraphs.

3.1.2. Dependence structures for social networks

It is clear from model (1) (which we can refer to as the multivariate $p^*$ distribution) that, in order to construct a probability model for a multivariate random array, we need to specify an appropriate dependence structure. We therefore consider some likely forms of dependencies arising in multivariate arrays constructed from various types of social networks. The literature on structural models for social networks contains a number of theoretical claims about the structural properties of networks that can be used to construct candidate dependence structures.

Multiplexity: Interdependence of relations linking a pair of individuals. The large literature on role-sets (Merton, 1957; Winship & Mandel, 1983; see also Chapter 12 of Wasserman & Faust, 1994) attests to the widespread belief that there is a likely dependence between different ties linking any given pair of individuals. The essence of the claim is that the presence of one type of tie between individuals is likely to affect the presence of other types of tie, and that, over time, distinctive role-sets comprising the relations linking a pair of individuals characterize the relationship from individual $i$ to individual $j$. Such multiplex interdependencies lead to maximal cliques in the dependence graph of the form $\{(i, j, 1), (i, j, 2), \ldots, (i, j, r)\}$; if these are the only dependencies that are assumed, the general class of Bernoulli multigraphs is obtained. The parameters of the model have the form $\lambda_A$, where $A$ is any subset of the form $\{(i, j, m_1), (i, j, m_2), \ldots, (i, j, m_q)\}$; the corresponding sufficient statistic in model (1) is $\prod_{h=1}^{q} x_{ijm_h}$. Of course, a special case of the model is obtained if we assume complete independence of all observations. The maximal cliques of the dependence graph are then of the form $\{(i, j, m)\}$, with exactly one parameter corresponding to each random variable in the array. If homogeneity is assumed for all node pairs $(i, j)$, then the model has just one parameter for each relation $m$ (with sufficient statistic $\sum_{i,j} x_{ijm}$); if homogeneity is imposed across all random variables in the multivariate network, then the model possesses a single parameter, with sufficient statistic $\sum_{i,j,m} x_{ijm}$.

Exchange and reciprocity. It has also been widely argued that a tie of one type from an individual $i$ to another individual $j$ may be conditionally dependent on ties from $j$ to $i$ of other types (see, for example, Parsons, 1966, for a traditional appeal to mutually consistent expectations through role norms, and Leifer, 1988, for an alternative and interesting dynamic account). If such conditional dependencies alone are assumed, then maximal cliques of $D$ have the form $\{(i, j, m), (j, i, l)\}$. If these conditional dependencies are assumed as well as the multiplex conditional dependencies, maximal cliques have the form

$$\{(i, j, 1), (i, j, 2), \ldots, (i, j, r), (j, i, 1), (j, i, 2), \ldots, (j, i, r)\}.$$

In the latter case, model (1) describes the multivariate dyad independence model, termed the multivariate $p_1$ model (Wasserman, 1987).

Role interlocking: path dependence. A third type of argument has pointed to the potential importance of role interlocking in social networks (for example, Boorman & White, 1976; Boyd, 1991; Lorrain & White, 1971; Pattison, 1993; White, 1977). It has been argued that the interrelationships among distinct types of ties can be represented by a partial ordering among labelled paths in a social network, where labelled paths systematically trace connections among sequences of individuals (see Pattison, 1993). More specifically, a path with the label $mh$ links individual $i$ to individual $j$ if there is a tie of type $m$ from $i$ to some intermediate individual $l$ and a tie of type $h$ from $l$ to $j$ (that is, if $X_{ilm} = 1$ and $X_{ljh} = 1$). Longer paths are defined recursively: a tie of type $mhn$ links individual $i$ to individual $j$ if there is some individual $l$ such that $i$ is linked to $l$ by a path with the label $mh$ and $l$ is linked to $j$ by a tie with the label $n$. We refer to the path $mhn$ as the concatenation of paths $mh$ and $n$, and note that concatenation is associative; that is, paths constructed as the concatenation of $mh$ and $n$ link precisely the same pairs of individuals as paths constructed by the concatenation of $m$ and $hn$. Paths in networks have been claimed both to provide the essential framework for the flow of social processes, as, for example, in the research on the diffusion of innovations (see, for example, Coleman, Katz & Menzel, 1966; Michaelson, 1990), and to give rise to some powerful anticipatory effects (see Lee, 1969; Mayer, 1977). The most rudimentary form of dependence associated with social structures conceived in this form involves conditional dependence between the variables $X_{ilm}$ and $X_{ljh}$. The maximal cliques induced by such an assumption are cycles of length 2, $\{(i, j, m), (j, i, h)\}$, and cycles of length 3, $\{(i, j, m), (j, l, h), (l, i, n)\}$. The resulting random graph model is new, and is a special case of the multivariate Markov random graphs mentioned earlier. Here, we term the corresponding version of model (1) the path-dependent random multigraph model.

Actor effects. The fourth argument has arisen in the social cognition literature and posits actor attributes or biases associated with either the actor from whom the tie is directed (leading to a so-called row effect) or the actor to whom the tie is directed (a so-called column effect). Row effects are associated with conditional dependencies of the form $((i, j, m), (i, k, h))$ and give rise to maximal cliques in $D$ of the form

$$\{(i, 1, 1), (i, 1, 2), \ldots, (i, 1, r), (i, 2, 1), (i, 2, 2), \ldots, (i, 2, r), \ldots, (i, g, 1), (i, g, 2), \ldots, (i, g, r)\}.$$

Such dependencies are likely to be assumed when actor $i$ is the source of information about all relational ties emanating from actor $i$; they are also necessarily imposed if constraints are placed on the total number of ties directed from actor $i$ (see Holland & Leinhardt, 1973). Column effects are associated with conditional dependencies of the form $((i, j, m), (k, j, h))$, and so with maximal cliques

$$\{(1, j, 1), (1, j, 2), \ldots, (1, j, r), (2, j, 1), (2, j, 2), \ldots, (2, j, r), \ldots, (g, j, 1), (g, j, 2), \ldots, (g, j, r)\}.$$

Position effects and blockmodels. A fifth theme in the structural analysis of multiple networks is that distinctive patterns of inter-individual ties are associated with particular social positions. Thus, individuals occupying similar social positions may exhibit similar conditional dependencies among ties, whereas those occupying distinct positions may possess quite distinct inter-tie dependencies. Thus, knowledge of social position may be used as a basis for some hypothesized equations among the parameters referring to particular patterns of conditional dependencies (determined by cliques in the dependence graph); these issues are further discussed below.

Interdependence of interlocking roles. In addition, several of these arguments may be combined. For instance, if we assume conditional dependencies associated with arguments for multiplexity, reciprocity and exchange, as well as role-interlocking effects, then the class of Markov random multigraphs results; its maximal cliques have the form of either a multivariate triad,

$$\{(i, j, 1), \ldots, (i, j, r), (j, k, 1), \ldots, (j, k, r), (k, i, 1), \ldots, (k, i, r), (j, i, 1), \ldots, (j, i, r), (k, j, 1), \ldots, (k, j, r), (i, k, 1), \ldots, (i, k, r)\},$$

or a multivariate star,

$$\{(1, i, 1), \ldots, (1, i, r), (2, i, 1), \ldots, (2, i, r), \ldots, (g, i, 1), \ldots, (g, i, r), (i, 1, 1), \ldots, (i, 1, r), (i, 2, 1), \ldots, (i, 2, r), \ldots, (i, g, 1), \ldots, (i, g, r)\}.$$

Note that these three assumptions also entail actor effects; hence we claim that the class of Markov random multigraphs is a quite plausible framework for the modelling of structure in multiple social networks.

3.1.3. Homogeneity constraints

For many of the specific dependence graphs that we have discussed, particularly for Markov random multigraphs, model (1) may require the estimation of a large number of parameters. It is often useful, therefore, to introduce certain equality constraints among the parameters or to set certain parameters to zero. One can define a class of homogeneous models for multivariate networks in which networks that are isomorphic under relabellings of the nodes are equiprobable.

More generally, we introduce an assumption that parameters corresponding to certain isomorphic configurations of nodes are equal. We identify a random graph configuration with a subset $A$ of $N_D$, and we call configurations corresponding to $A$ and $B$ isomorphic if there is a one-to-one mapping $w$ on the nodes in $N$ such that $(i, j, m) \in A$ if and only if $(w(i), w(j), m) \in B$, for $i, j \in N$, $m \in R$. If two configurations $A$ and $B$ are isomorphic, we set $\lambda_A = \lambda_B$ and note that the sufficient statistic corresponding to $\lambda_A$ becomes

$$\sum_{B} \prod_{(i,j,m) \in B} x_{ijm},$$

where the summation is over all configurations $B$ isomorphic to $A$.

A more restricted form of parameter equating may also be useful when the nodes of the random graph are hypothesized to fall into distinct classes or positions (as in an a priori blockmodel; see, for example, White et al., 1976; Wasserman & Anderson, 1987; Wasserman & Faust, 1994, Chapter 10). In this case, the random graph nodes of the configuration identified with the subset $A$ may be regarded as coloured, and two configurations $A$ and $B$ are defined as isomorphic if there is a one-to-one mapping $w$ on the nodes of $N$ such that:

1. $(i, j, m) \in A$ if and only if $(w(i), w(j), m) \in B$;
2. $i$ and $w(i)$ have the same colour;
3. $j$ and $w(j)$ have the same colour.

We then set $\lambda_A = \lambda_B$ only if $A$ and $B$ are isomorphic (using this more restrictive definition).

3.2. The multivariate $p^*$ model

As mentioned, we refer to equation (1) as the multivariate $p^*$ model. The parameters of the model correspond to the cliques of the dependence graph $D$. The sufficient statistic corresponding to the parameter $\lambda_A$ for clique $A$ of $D$ has the form $\prod_{(i,j,m) \in A} x_{ijm}$; in the case where homogeneity effects have been imposed, the sufficient statistics are counts of such values over cliques whose parameters are set to be equal.

3.2.1. Introduction

The dependence structures for social networks described in the preceding section give rise to network statistics identified in Table 1. In order more easily to define the statistics, we introduce a counting function $f$ for an array $\mathbf{Z}$ as the sum of entries in the array: $f_{\mathbf{Z}} = \sum_{i,j} Z_{ij}$. The function $f$ is a count of the number of distinct ordered pairs of nodes $i$ and $j$ for which there is a relational tie of type $\mathbf{Z}$. For convenience, we refer to the parameter corresponding to $f_{\mathbf{Z}}$ as $\theta_{\mathbf{Z}}$.

Table 1. Some statistics and parameters for univariate and multivariate relations

Effect                      Parameter         Graph statistic in 'count form'

Single dichotomous relations
Choice                      $\theta$          $f_{\mathbf{X}}$
Mutuality                   $\rho$            $f_{\mathbf{X} \cap \mathbf{X}'}$
Transitivity                $\tau$            $f_{\mathbf{X}\mathbf{X} \cap \mathbf{X}}$
Expansiveness               $\alpha_i$        $f_{\mathbf{X} \cap \mathbf{R}_i}$
Attractiveness              $\beta_j$         $f_{\mathbf{X} \cap \mathbf{C}_j}$
m-paths                     $\pi_m$           $f_{\mathbf{X}^m}$
Subgroup density            $\phi_{[st]}$     $f_{\mathbf{X} \cap \delta_{st}}$
Subgroup mutuality          $\rho_{[st]}$     $f_{\mathbf{X} \cap \mathbf{X}' \cap \delta_{st}}$
Subgroup transitivity       $\tau_{[sut]}$    $f_{(\mathbf{X} \cap \delta_{su})(\mathbf{X} \cap \delta_{ut}) \cap (\mathbf{X} \cap \delta_{st})}$

Multivariate relations
Association                 $C$               $f_{\mathbf{X} \cap \mathbf{Y}}$
Multiplexity                $\theta_{kl}$     $f_{\mathbf{X}_k \cap \mathbf{X}_l}$
Exchange                    $\rho_{kl}$       $f_{\mathbf{X}_k \cap \mathbf{X}'_l}$
Generalized transitivity    $\tau_{klm}$      $f_{(\mathbf{X}_k\mathbf{X}_l) \cap \mathbf{X}_m}$
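To make the 'count form' concrete, the sketch below evaluates the counting function on a few of the derived arrays listed in Table 1. It is a hedged illustration: the helper names (`f`, `conv`, `meet`, `comp`) and the two 3-actor relations are hypothetical, sociomatrices are assumed to have zero diagonals, and the count simply sums every entry of the array it is given.

```python
def f(z):
    """Counting function: the sum of all entries of a 0/1 array."""
    return sum(sum(row) for row in z)

def conv(x):
    """Converse of a relation (matrix transpose)."""
    g = len(x)
    return [[x[j][i] for j in range(g)] for i in range(g)]

def meet(a, b):
    """Entrywise intersection of two relations."""
    g = len(a)
    return [[a[i][j] * b[i][j] for j in range(g)] for i in range(g)]

def comp(a, b):
    """Composition of two relations (Boolean matrix product)."""
    g = len(a)
    return [[int(any(a[i][l] and b[l][j] for l in range(g)))
             for j in range(g)] for i in range(g)]

# Hypothetical relations on three actors.
likes = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]
advises = [[0, 1, 1], [1, 0, 0], [0, 0, 0]]

stats = {
    "choice, likes": f(likes),
    "choice, advises": f(advises),
    "mutuality, advises": f(meet(advises, conv(advises))),
    "multiplexity": f(meet(likes, advises)),
    "exchange": f(meet(likes, conv(advises))),
    "generalized transitivity": f(meet(comp(likes, likes), advises)),
}
for name, value in stats.items():
    print(name, value)
```

Note that the mutuality count is taken over ordered pairs, so each mutual dyad contributes two to the statistic.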

When homogeneity constraints are imposed, we can represent the sufficient statistics in a compact form. For the assumption of multiplex conditional dependencies, any clique in $N_D$ has the form

$$A = \{(i, j, m_1), (i, j, m_2), \ldots, (i, j, m_q)\};$$

thus, in the homogeneous case, the sufficient statistic for the multiplex parameter associated with the clique $A$ is $f_{\mathbf{Z}}$, where $\mathbf{Z} = \mathbf{X}_{m_1} \cap \mathbf{X}_{m_2} \cap \cdots \cap \mathbf{X}_{m_q}$. (Note that any non-empty subset of relations gives rise to a clique of this form, so that we also have statistics of the form $f_{\mathbf{X}_m}$, $f_{\mathbf{X}_k \cap \mathbf{X}_l}$, and so forth.)

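Reciprocity cliques of the form $\{(i, j, m), (j, i, l)\}$ give rise to the exchange statistics $f_{\mathbf{X}_k \cap \mathbf{X}'_l}$. Cliques in role-interlocking dependence structures lead to additional 2-path and 3-cycle statistics of the form $f_{\mathbf{X}_m\mathbf{X}_h}$ and $f_{(\mathbf{X}_m\mathbf{X}_h) \cap \mathbf{X}'_n}$, respectively.

Some of the statistics for parameters reflecting row and column effects can be defined using the indicator matrices $\mathbf{R}_i$ and $\mathbf{C}_j$, whose elements are given by

$$(\mathbf{R}_i)_{kl} = \begin{cases} 1 & \text{if } k = i, \\ 0 & \text{otherwise,} \end{cases} \qquad (\mathbf{C}_j)_{kl} = \begin{cases} 1 & \text{if } l = j, \\ 0 & \text{otherwise.} \end{cases}$$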

In order to define statistics for the Markov random multigraph model, let $R_k$ be any subset of relations and define $\mathbf{Y}_k$ as the intersection of the relations in $R_k$. The triad statistic corresponding to a general multivariate triad has the general form $f_{\mathbf{Z}}$, with $\mathbf{Z} = (\mathbf{Y}_1 \cap \mathbf{Y}'_4)(\mathbf{Y}_2 \cap \mathbf{Y}'_5) \cap (\mathbf{Y}_3 \cap \mathbf{Y}'_6)$, for some $\mathbf{Y}_1, \mathbf{Y}_2, \ldots, \mathbf{Y}_6$.

When homogeneity is imposed only within $S$ possible blocks or positions, the network statistics that arise correspond to within-block sums, and can be represented by using the matrix $\delta_{st}$, with entries

$$(\delta_{st})_{ij} = \begin{cases} 1 & \text{if } i \in \text{block } s \text{ and } j \in \text{block } t, \\ 0 & \text{otherwise.} \end{cases}$$

For example, in the case of any homogeneous statistic $f_{\mathbf{Z}}$, the block-homogeneous set of statistics is $\{f_{\mathbf{Z} \cap \delta_{st}} : s = 1, 2, \ldots, S;\ t = 1, 2, \ldots, S\}$.

Some other network statistics and associated parameters are also presented in Table 1. This table also identifies the parameter labels used in Wasserman & Pattison (1996) and their generalizations to multivariate networks.

Note that each of the statistics described above may be assumed to be homogeneous, or may be allowed to depend on some mutually exclusive and exhaustive partition of actors or pairs of actors. For example, generalized transitivity statistics may be calculated for every triple of subgroups arising from a partition (for example, $f_{(\mathbf{X}_k \cap \delta_{su})(\mathbf{X}_l \cap \delta_{ut}) \cap (\mathbf{X}_m \cap \delta_{st})}$) and may be used to assess the homogeneity of generalized transitivity across subgroups.
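The block-homogeneous statistics can be illustrated by intersecting an array with each block indicator matrix and counting the result. The following sketch assumes a hypothetical 4-actor array `z` and a hypothetical assignment of actors to two blocks; the function names are likewise illustrative.

```python
def block_indicator(blocks, s, t):
    """Indicator matrix: entry (i, j) is 1 when actor i is in block s
    and actor j is in block t."""
    g = len(blocks)
    return [[int(blocks[i] == s and blocks[j] == t) for j in range(g)]
            for i in range(g)]

def block_statistics(z, blocks):
    """Count of the entries of z falling in each ordered pair of blocks."""
    g = len(z)
    labels = sorted(set(blocks))
    out = {}
    for s in labels:
        for t in labels:
            d = block_indicator(blocks, s, t)
            out[(s, t)] = sum(z[i][j] * d[i][j]
                              for i in range(g) for j in range(g))
    return out

# Hypothetical array and block membership (actors 0, 1 in block 0; 2, 3 in block 1).
z = [[0, 1, 1, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 1, 1, 0]]
blocks = [0, 0, 1, 1]

counts = block_statistics(z, blocks)
print(counts)
```

Because the blocks partition the actors, the block counts add up to the homogeneous count for the whole array.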


3.2.2. The model

In combination with various homogeneity constraints, model (1) can be written in the general form

$$P(\mathbf{X} = \mathbf{x}) = \frac{\exp\{\boldsymbol{\theta}'\mathbf{z}(\mathbf{x})\}}{\kappa(\boldsymbol{\theta})}, \tag{2}$$

where $\boldsymbol{\theta}$ is a vector of model parameters and $\mathbf{z}(\mathbf{x})$ is a vector of network statistics. As we have described, these vectors depend on the structure of the hypothesized dependence graph, and on whether any homogeneity constraints have been proposed.

The model is of exponential family form; that is, the probability function depends on an exponential function of a linear combination of network statistics. In some cases, constraints on the elements of $\boldsymbol{\theta}$ are required in order to ensure a set of uniquely determined parameters (as we illustrate later with our examples). Usually, the elements of $\boldsymbol{\theta}$ are unknown and must be estimated.

The function $\kappa(\boldsymbol{\theta})$ in the denominator of model (2) is a normalizing quantity whose value guarantees that the probability distribution is indeed proper, summing to unity over the sample space of the random variable $\mathbf{X}$ (the set of all possible multivariate networks with $r$ relations and $g$ actors).

Estimation of the parameters of models that assume only multiplexity and/or generalized reciprocity and exchange effects (as in the multivariate $p_1$ model) is not particularly difficult. In these cases, the likelihood function is simply the product of the probabilities for each multivariate tie or dyad (for example, see Wasserman, 1987). Estimation of parameters of the general multivariate $p^*$ model is not straightforward, however. The likelihood function for the parameters $\boldsymbol{\theta}$ of $p^*$ depends on the complicated normalizing quantity $\kappa(\boldsymbol{\theta})$, which makes maximum likelihood estimation difficult, except in special circumstances (such as dyadic independence) and when the multigraphs are quite small (Walker, 1995). In order for probabilities to be computed, one must be able to calculate $\kappa$, which is just too difficult for most networks. Hence, alternative model formulations and approximate estimation techniques are important. One such alternative, which we now describe, utilizes log-odds ratios of the conditional probabilities of each element of $\mathbf{X}$.
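The difficulty with the normalizing quantity can be seen by computing it by brute force for a very small case. The sketch below is an illustration under stated assumptions, not the authors' procedure: it uses a hypothetical homogeneous model with three statistics (total ties, multiplexity, exchange) for $r = 2$ relations, stores an array as a dictionary keyed by `(i, j, m)`, and enumerates every array in the sample space.

```python
from itertools import product
from math import exp

def cells(g, r):
    """Index triples (i, j, m) of all off-diagonal entries of the array."""
    return [(i, j, m) for i in range(g) for j in range(g) if i != j
            for m in range(r)]

def stats(x, g):
    """z(x) for an assumed model with r = 2: total ties, multiplexity, exchange."""
    pairs = [(i, j) for i in range(g) for j in range(g) if i != j]
    ties = sum(x.values())
    multiplex = sum(x[i, j, 0] * x[i, j, 1] for i, j in pairs)
    exchange = sum(x[i, j, 0] * x[j, i, 1] for i, j in pairs)
    return [ties, multiplex, exchange]

def kappa(theta, g, r=2):
    """Normalizing quantity: sum of exp(theta' z(x)) over every possible array."""
    keys = cells(g, r)
    total = 0.0
    for values in product((0, 1), repeat=len(keys)):
        x = dict(zip(keys, values))
        total += exp(sum(t * z for t, z in zip(theta, stats(x, g))))
    return total

print(kappa([0.0, 0.0, 0.0], 3))   # one term per array: 2 ** 12 of them
print(2 ** (2 * 10 * 9))           # size of the sample space for g = 10, r = 2
```

With $g = 3$ and $r = 2$ the sum has $2^{12}$ terms, but the number of terms is $2^{rg(g-1)}$ in general, so direct enumeration is out of reach well before $g = 10$.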

3.2.3. The logit model

We can turn model (2) into a generalized autologistic model for conditional probabilities, giving us an equivalence between model (2) and spatial models (Besag, 1972, 1974; Strauss, 1992). The step utilizes the dichotomous nature of the random variable $X_{ijm}$, and produces an approximate likelihood function that is much easier to deal with.

We first condition on the complement of $X_{ijm}$, and consider just the probability that the dichotomous random variable $X_{ijm}$ is unity. Recall that this variable records whether the tie from $i$ to $j$ of type $m$ is present. Specifically, consider

$$P(X_{ijm} = 1 \mid \mathbf{X}^c_{ijm}) = \frac{P(\mathbf{X} = \mathbf{x}^+_{ijm})}{P(\mathbf{X} = \mathbf{x}^+_{ijm}) + P(\mathbf{X} = \mathbf{x}^-_{ijm})} = \frac{\exp\{\boldsymbol{\theta}'\mathbf{z}(\mathbf{x}^+_{ijm})\}}{\exp\{\boldsymbol{\theta}'\mathbf{z}(\mathbf{x}^+_{ijm})\} + \exp\{\boldsymbol{\theta}'\mathbf{z}(\mathbf{x}^-_{ijm})\}}, \tag{3}$$

which has the advantage of not depending on the normalizing quantity. We next consider the odds ratio, which simplifies model (3):

$$\frac{P(X_{ijm} = 1 \mid \mathbf{X}^c_{ijm})}{P(X_{ijm} = 0 \mid \mathbf{X}^c_{ijm})} = \frac{\exp\{\boldsymbol{\theta}'\mathbf{z}(\mathbf{x}^+_{ijm})\}}{\exp\{\boldsymbol{\theta}'\mathbf{z}(\mathbf{x}^-_{ijm})\}} = \exp\{\boldsymbol{\theta}'[\mathbf{z}(\mathbf{x}^+_{ijm}) - \mathbf{z}(\mathbf{x}^-_{ijm})]\}. \tag{4}$$

From this, the log-odds ratio, or logit model, has the rather simple expression

$$
\omega_{ijm} = \log\left\{\frac{P(X_{ijm} = 1 \mid X^{C}_{ijm})}{P(X_{ijm} = 0 \mid X^{C}_{ijm})}\right\} = \theta' [z(x^{+}_{ijm}) - z(x^{-}_{ijm})]. \tag{5}
$$

If we define $d(x_{ijm}) = z(x^{+}_{ijm}) - z(x^{-}_{ijm})$, then the logit model (5) simplifies succinctly to $\omega_{ijm} = \theta' d(x_{ijm})$. The expression $d(x_{ijm})$ is the vector of changes in the network statistics that arise when the quantity $x_{ijm}$ changes from 1 to 0. This version of the model is a logit p* model for a multivariate network and is a generalized autologistic model (see Strauss, 1992) applied to social network data.

3.3. Estimation

The likelihood function for the general form of the multivariate p* model (2) is

$$
L(\theta) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)},
$$

where the dependence on the normalizing quantity can easily be seen. As mentioned, maximum likelihood estimation of $\theta$ is difficult due to the size of the sample space.

An approximate estimation approach, proposed by Besag (1975, 1977b) and adopted by Strauss (1986), Strauss & Ikeda (1990) and Wasserman & Pattison (1996), utilizes tools made popular in models for rectangular lattices and spatial data; specifically, we use the logit formulation and define the pseudo-likelihood function as

$$
PL(\theta) = \prod_{i \neq j} \prod_{m=1}^{r} P(X_{ijm} = 1 \mid X^{C}_{ijm})^{x_{ijm}} \, P(X_{ijm} = 0 \mid X^{C}_{ijm})^{1 - x_{ijm}}, \tag{6}
$$

and a maximum pseudo-likelihood estimator (MPLE) to be the value of $\theta$ that maximizes (6). MPLEs are much easier to calculate than maximum likelihood estimators (MLEs). MPLEs differ from MLEs for all but the simplest models (those for which the conditional probabilities are indeed independent of the complement relation). Basically, the approach assumes conditional independence of the random variables representing the multivariate relational ties (for discussion of the issues in using maximum pseudo-likelihood rather than maximum likelihood estimation, see Wasserman & Pattison, 1996, and Preisler, 1993).
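To make the pseudo-likelihood (6) concrete, the sketch below evaluates its logarithm for a given $\theta$, using the fact that under model (5) each conditional probability is a logistic function of $\theta' d(x_{ijm})$. The input format (a list of observed ties paired with their change vectors) and the function name are assumptions made for the example.

```python
import math


def log_pseudolikelihood(theta, observations):
    """Log of PL(theta) in (6).

    observations is a list of (x_ijm, d_ijm) pairs, one per ordered pair i != j
    and relation m, where d_ijm is the change-statistic vector for that tie.
    """
    total = 0.0
    for x_obs, d in observations:
        logit = sum(t * v for t, v in zip(theta, d))
        # log(1 + exp(logit)), written to avoid overflow for large |logit|
        log_norm = max(logit, 0.0) + math.log1p(math.exp(-abs(logit)))
        # log P(X=1 | rest) = logit - log_norm; log P(X=0 | rest) = -log_norm
        total += x_obs * logit - log_norm
    return total


# Two ties with a single (choice) statistic: at theta = 0 each term is log(1/2).
print(log_pseudolikelihood([0.0], [(1, [1]), (0, [1])]))
```

Because this function has the same form as a logistic regression log-likelihood, any general-purpose optimizer or logistic regression routine can be used to maximize it.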

There is a large literature on the use of approximate likelihoods in spatial modelling. Diggle (1996) reviews models for discrete spatial variation and notes that there are several possible estimation techniques. He notes, in his detailed discussion, that MPLEs are more efficient than other possibilities (which include the coding method of Besag, 1974). Further, for moderately large samples, the differences between MPLEs and MLEs are often negligible. Small sample sizes, and hence small networks ($g < 10$), unfortunately are particularly problematic.


In social network modelling, Strauss & Ikeda (1990) established that estimation of $\theta$ for single dichotomous relations can be accomplished via logistic regression, using any standard logistic regression model-fitting routine. In particular, they showed that maximizing the pseudo-likelihood given in equation (6) is equivalent to maximizing the likelihood function for the fit of logistic regression to model (5) (for independent observations $x_{ijm}$). Further, they observed that such logistic regressions can be fitted using iteratively reweighted Gauss–Newton computational techniques, as implemented by any logistic regression model package.

The proof of this result uses the fact that the derivatives of the pseudo-likelihood, set equal to zero, are identical to those obtained from a logistic regression with the relational variables as data values. Thus, fitting p* can be done by using the logit p* form and assuming that the relational variables are actually statistically independent. The idea for this theorem was first suggested by Frank & Strauss (1986) for estimation of the parameters in their triad model. The generalization of this result to the three-way binary array $X$ is straightforward.
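The equivalence with logistic regression can be sketched as follows: the observed ties are regressed on their change vectors, with no separate intercept (a choice statistic, whose change value is always 1, plays that role). The Newton–Raphson routine below is a minimal stand-in for a standard logistic regression package, written with the standard library only; the function names and the toy data are assumptions made for the example.

```python
import math


def solve(a, b):
    """Solve the linear system a s = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[k]] for k, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    s = [0.0] * n
    for r in range(n - 1, -1, -1):
        s[r] = (m[r][n] - sum(m[r][c] * s[c] for c in range(r + 1, n))) / m[r][r]
    return s


def fit_mple(y, d, iterations=25):
    """Maximize the pseudo-likelihood by Newton-Raphson logistic regression of y on d."""
    p = len(d[0])
    theta = [0.0] * p
    for _ in range(iterations):
        score = [0.0] * p
        info = [[0.0] * p for _ in range(p)]
        for y_n, d_n in zip(y, d):
            eta = sum(t * v for t, v in zip(theta, d_n))
            mu = 1.0 / (1.0 + math.exp(-eta))  # fitted conditional probability
            for a in range(p):
                score[a] += (y_n - mu) * d_n[a]
                for b in range(p):
                    info[a][b] += mu * (1.0 - mu) * d_n[a] * d_n[b]
        step = solve(info, score)
        theta = [t + s for t, s in zip(theta, step)]
    return theta


# Toy data: first column is a choice statistic (always 1), second a 0/1 change statistic.
y = [1, 0, 0, 1, 1, 0]
d = [[1, 0], [1, 0], [1, 0], [1, 1], [1, 1], [1, 1]]
print(fit_mple(y, d))
```

The sketch has no safeguards against separation or a singular information matrix, which a production routine would need. Standard errors read off such a regression should be treated as approximate only, since the ties are not in fact independent.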

The evaluation of the fit of multivariate p* is not straightforward, but it is helpful to compare the observed values $x_{ijm}$ with the fitted values $\hat{x}_{ijm}$. The fitted values, as is common with dichotomous variables, are defined as $\hat{x}_{ijm} = \hat{P}(X_{ijm} = 1 \mid X^{C}_{ijm})$. The estimated conditional probabilities are computed from

$$
\mathrm{logit}\, \hat{P}(X_{ijm} = 1 \mid X^{C}_{ijm}) = \hat{\theta}' d(x_{ijm}).
$$

Two useful indices of fit are the pseudo-likelihood ratio statistic

$$
G^2_{PL} = 2 \sum x_{ijm} \log(x_{ijm} / \hat{x}_{ijm})
$$

for a model, and the mean of the absolute value of the residuals $(x_{ijm} - \hat{x}_{ijm})$. In the examples below, we report both $G^2_{PL}$ and the mean absolute residual. Unfortunately, as with all other uses of this MPLE approach, the distribution of $G^2_{PL}$ is unknown, even asymptotically, and there is no straightforward way of estimating the standard errors of parameter estimates (although asymptotic standard errors calculated from logistic regression models can give approximate guidance to the modeller). Crouch & Wasserman (1998) give some preliminary results comparing MPLEs to MLEs, and report the optimistic finding that, for moderately large networks ($g > 10$), both standard errors and test statistics based on the pseudo-likelihood approach are quite close to those based on the exact likelihood.
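A minimal sketch of the second fit index follows: fitted values are obtained by applying the inverse logit to $\hat{\theta}' d(x_{ijm})$, and the mean absolute residual is averaged over all ties. The function names and toy inputs are assumptions made for the example.

```python
import math


def fitted_values(theta_hat, d):
    """Fitted conditional probabilities: inverse logit of theta_hat' d for each tie."""
    fitted = []
    for d_n in d:
        eta = sum(t * v for t, v in zip(theta_hat, d_n))
        fitted.append(1.0 / (1.0 + math.exp(-eta)))
    return fitted


def mean_absolute_residual(x, fitted):
    """Mean of |x_ijm - fitted_ijm| over all ties."""
    return sum(abs(a - b) for a, b in zip(x, fitted)) / len(x)


# With theta_hat = 0 every fitted value is 0.5, so the mean absolute residual is 0.5.
x_obs = [1, 0, 1, 0]
d_obs = [[1], [1], [1], [1]]
print(mean_absolute_residual(x_obs, fitted_values([0.0], d_obs)))
```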

3.4. Computational details

Maximum pseudo-likelihood estimates of the parameters of model (1) are obtained by fitting the logistic regression model (5). In order to fit model (5), we compute, for each relational tie, the values of the 'explanatory variables' $z(x^{+}_{ijm}) - z(x^{-}_{ijm})$ corresponding to each statistic $z(x)$; we then use these as the observed explanatory variables for the realization of $X_{ijm}$ (the 'response variable') in the logistic regression corresponding to model (5).

The computation of the values $z(x^{+}_{ijm}) - z(x^{-}_{ijm})$ is simple, but it is useful to note that the values may take a different form for the various types of relational ties (corresponding to the subscript $m$ of $X_{ijm}$). For example, suppose that there are two relations, $X_l$ and $X_h$, and consider the parameter corresponding to the triadic effect $Z = (X_l X_h) \cap X_h$. If we assume homogeneity, then the sufficient statistic for this parameter is $f_Z$. For the two relations, the computed values of the explanatory variable for this triadic effect are equal to the changes in the statistic $f_Z$ when $x_{ijm}$ changes from 1 to 0, for $m = l$ or $h$. Thus, when $m = l$ (corresponding to the values for the first relation, $X_l$), we compute $\sum_k x_{ikh} x_{jkh}$ as the value of the explanatory variable corresponding to this parameter, and when $m = h$ (corresponding to an $X_h$ tie), we compute $\sum_k x_{ikl} x_{kjh} + \sum_k x_{kil} x_{kjh}$.
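The two closed-form expressions above can be checked against a brute-force calculation. The sketch below assumes that the statistic for $Z = (X_l X_h) \cap X_h$ counts configurations in which an $X_l$ tie from $i$ to $j$ and an $X_h$ tie from $j$ to $k$ are accompanied by an $X_h$ tie from $i$ to $k$, that is, $f_Z = \sum x_{ijl} x_{jkh} x_{ikh}$ over distinct $i, j, k$. That reading, the function names, and the toy matrices are assumptions made for the example.

```python
def f_Z(xl, xh):
    """Assumed statistic: sum over distinct i, j, k of x_ijl * x_jkh * x_ikh."""
    g = len(xl)
    return sum(xl[i][j] * xh[j][k] * xh[i][k]
               for i in range(g) for j in range(g) for k in range(g)
               if i != j and j != k and i != k)


def change_by_toggling(xl, xh, i, j, m):
    """Change in f_Z when the tie (i, j) of type m ('l' or 'h') goes from 1 to 0."""
    target = xl if m == "l" else xh
    original = target[i][j]
    target[i][j] = 1
    plus = f_Z(xl, xh)
    target[i][j] = 0
    minus = f_Z(xl, xh)
    target[i][j] = original  # restore the observed value
    return plus - minus


def change_closed_form(xl, xh, i, j, m):
    """The closed-form explanatory variables given in the text."""
    others = [k for k in range(len(xl)) if k != i and k != j]
    if m == "l":
        return sum(xh[i][k] * xh[j][k] for k in others)
    return sum(xl[i][k] * xh[k][j] + xl[k][i] * xh[k][j] for k in others)


# Toy 4-actor example with zero diagonals.
xl = [[0, 1, 0, 1], [0, 0, 1, 0], [1, 0, 0, 0], [0, 1, 1, 0]]
xh = [[0, 1, 1, 0], [1, 0, 1, 0], [0, 0, 0, 1], [1, 1, 0, 0]]
agree = all(
    change_by_toggling(xl, xh, i, j, m) == change_closed_form(xl, xh, i, j, m)
    for i in range(4) for j in range(4) if i != j for m in ("l", "h")
)
print(agree)
```

The closed forms need only the ties adjacent to $i$ and $j$, which is why they are preferable to recomputing $f_Z$ for every tie in a network of realistic size.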

4. Examples

We illustrate the construction and fitting of multivariate p* models using two examples.

4.1. The Grade 7 peer network

The first example is an extension of the data analysed by Wasserman & Pattison (1996). Vickers (1981) and Vickers & Chan (1981) obtained network data from 29 students in grade 7 in a school in Victoria, Australia. They asked students to nominate their classmates on a number of relations, including the following:

1) Who are your best friends in the class?
2) Who would you rather not have as a friend?

We label the relations defined by these two questions as $X_B$ (relation 1) and $X_N$ (relation 2), and their associated matrices as $B$ and $N$, respectively. The matrix for the 'best friends' relation is given here as our Table 2, and the matrix for the 'not friends' relation as our Table 3. As noted by Wasserman & Pattison (1996), actors 1–12 are boys, while actors 13–29 are girls.

In Wasserman & Pattison (1996), we analysed the relation $X_B$ and established that it possessed strong reciprocity and transitivity effects. Here, we fit models simultaneously to the relations $X_B$ and $X_N$ in an attempt to model their mutual interdependence. Our models use the methodology described earlier and are guided by the literature that has speculated on the structure of positive and negative affect ties (see the discussion in Wasserman & Faust, 1994, Chapter 6, on signed graphs); we also compare our models to previous descriptive analyses of similar types of ties. We report the fit of a number of homogeneous models.

Models 1a and 1b – independence. We first fit two versions of a complete independence model in which we make the (implausible) assumption that all observed ties are independent. In the first version of the model, we allow a single, separate 'choice' parameter $\theta_Z$ (where $Z$ may be either $B$ or $N$) for each type of relation; in the second, more restricted version, we assume a single common choice parameter. In both versions of the model, the maximal cliques of the dependence graph have the form $\{(i, j, m)\}$; in model 1a, the parameters corresponding to this clique are assumed to depend on relation $m$ (but not on actor $i$ or $j$), whereas in model 1b, the parameter is assumed constant for all $i$, $j$ and $m$. The sufficient statistics for model 1a are $f_B$ and $f_N$; model 1b has sufficient statistic $f_{B+N}$. The fit of models 1a and 1b is summarized in Table 4. Neither model provides a good fit, with the mean of the absolute residuals equal to approximately 0.37. Since model 1b is nested in model 1a, the difference between the pseudo-likelihood ratio statistics is of interest, and we note that model 1b appears to be no worse a fit than model 1a ($\Delta G^2_{PL} = 3.2$, and the models differ by one parameter).


Model 2 – multiplexity. Model 2 is a multiplexity model with maximal cliques $\{(i, j, 1), (i, j, 2)\}$. The model allows for the possibility that an $X_B$ tie from $i$ to $j$ is conditionally dependent on an $X_N$ tie from $i$ to $j$. The parameters of the model have the form $\theta_Z$, where $Z$ may be $B$, $N$ or $B \cap N$; the corresponding sufficient statistics are $f_B$, $f_N$ and $f_{B \cap N}$, respectively. Thus, this model adds a single multiplex parameter $\theta_{B \cap N}$ to the two choice parameters in model 1a. Model 2 appears to be a substantial improvement over model 1a ($\Delta G^2_{PL} = 253.7$ with one additional parameter), but the small frequency of $B \cap N$ ties implies that the MPLE of its corresponding parameter is likely to have a large standard error.

Models 3a and 3b – reciprocity and exchange. Model 3 assumes bivariate dyad independence (as described by Wasserman, 1987) and has maximal cliques $\{(i, j, 1), (i, j, 2), (j, i, 1), (j, i, 2)\}$. We fit two restricted versions of the model: first, model 3a, in which only choice and reciprocity effects are assumed (with parameters $\theta_Z$ for $Z = B$, $N$, $B \cap B'$ and $N \cap N'$), and second, model 3b, with an additional exchange parameter $\theta_Z$ for the relation $Z = B \cap N'$. In model 3a, the presence of an $X_B$ tie from $i$ to $j$ is assumed to be conditionally dependent on the presence of an $X_B$ tie from $j$ to $i$ (that is, on the presence of reciprocity); similarly for $X_N$ ties. Model 3b allows, in addition, the presence of an $X_B$ tie from $i$ to $j$ to be conditionally dependent on the presence of an $X_N$ tie from $j$ to $i$ (that is, on the exchange of an $X_N$ tie for an $X_B$ one). We have not fitted the most general homogeneous dyad-independence model, which includes multiplexity parameters, since $B$ and $N$ co-occur only rarely (and, as a result, it is difficult to fit parameters corresponding to relations such as $B \cap N$, $B \cap N \cap B'$, and so forth). The fit statistics in Table 4 indicate that not only is model 3a a substantial improvement over model 1a ($\Delta G^2_{PL} = 208.6$ with just two additional parameters), but also that model 3b provides a marginally better fit than model 3a ($\Delta G^2_{PL} = 27.1$ with one additional parameter). These figures suggest the presence of both reciprocity and exchange effects. Note, though, that the fit of model 3b is still not particularly good, with the mean of the absolute residuals equal to 0.311.

Table 2. Vickers & Chan's (1981) network data: 'best friends' relation

0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 1 0 1 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 1 0 1 0 1 1 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 0 1 0
1 1 1 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 0 0 1 1 1
1 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 0 0 1 1 1 1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 1 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 1 0 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 1 0 0 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0

Table 3. Vickers & Chan's (1981) network data: 'not friends' relation

0 0 0 0 0 0 1 0 1 1 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 0 0 1 1
0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 1 1 0 0 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0
1 0 1 1 0 0 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 1
0 0 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
0 0 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0
1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 1
1 0 0 1 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0

Table 4. Summary of fit of models 1a–5b to the grade 7 peer network

Model   No. of parameters   $G^2_{PL}$   Mean absolute residual
1a       2                  1794.1       0.366
1b       1                  1797.3       0.367
2        3                  1540.4       0.329
3a       4                  1585.5       0.315
3b       5                  1558.4       0.311
4       13                  1511.0       0.300
5a      19                  1220.6       0.241
5b      23                  1032.3       0.196

Model 4 – path dependence. Model 4 is a path-dependent model and assumes that a tie of any type from $i$ to $j$ may be conditionally dependent on ties of any type from $j$ to some third individual $k$. Maximal cliques therefore have the form $\{(i, j, m), (j, i, h)\}$ or $\{(i, j, m), (j, k, h), (k, i, p)\}$; parameters and sufficient statistics are given by $\theta_Z$ and $f_Z$, respectively, where $Z$ may be any of the relations $B$, $N$, $B \cap B'$, $N \cap N'$, $B \cap N'$, $BB$, $BN$, $NB$, $NN$, $BB \cap B'$, $BN \cap B'$, $BN \cap N'$ and $NN \cap N'$. Compared to model 3b, model 4 adds only marginally to the fit ($\Delta G^2_{PL} = 47.4$ with eight additional parameters).

Models 5a and 5b – restricted Markov random graph models. The final set of models are path-dependent models with additional dependencies assumed on substantive grounds. All models have the model 4 parameters; in addition, model 5a possesses dependencies consistent with the transitivity-like hypothesis that friends are likely to agree on their relations with third parties (hence, likely pairwise conditional dependencies between relational ties of type $X_B$ from $i$ to $j$, $i$ to $k$ and $j$ to $k$, and also between relational ties of type $X_B$ from $i$ to $j$, of type $X_N$ from $i$ to $k$ and of type $X_N$ from $j$ to $k$). Model 5b possesses additional dependencies consistent with the claim that non-friends are likely to disagree on their relations with third parties (hence, likely pairwise conditional dependencies between relational ties of type $X_N$ from $i$ to $j$, of type $X_N$ from $i$ to $k$ and of type $X_B$ from $j$ to $k$, and also between relational ties of type $X_N$ from $i$ to $j$, of type $X_B$ from $i$ to $k$ and of type $X_N$ from $j$ to $k$). (See Johnsen (1986) for a review and analysis of the literature on the structure of affective ties, and Pattison (1993) for an algebraic translation of these structural claims.) Model 5a adds $\{(i, j, 1), (j, k, 1), (i, k, 1)\}$ and $\{(i, j, 1), (j, k, 2), (i, k, 2)\}$ to the set of maximal cliques for model 4; model 5b also adds $\{(i, j, 2), (j, k, 1), (i, k, 2)\}$ and $\{(i, j, 2), (j, k, 2), (i, k, 1)\}$. We note that all of the subcliques of these additional maximal cliques have corresponding parameters in models 5a and 5b; these additional subcliques correspond to various forms of stars: $\{(i, j, m), (i, k, h)\}$, $\{(i, j, m), (k, j, h)\}$ and $\{(i, j, m), (j, k, h)\}$. As indicated in Table 4, the additional dependencies assumed by model 5a lead to a substantial improvement over the simple path-dependent model 4 ($\Delta G^2_{PL} = 290.4$ with six additional parameters), and those associated with model 5b lead to a modest further improvement in fit ($\Delta G^2_{PL} = 188.3$ with four additional parameters). The mean of the absolute residuals for model 5b is 0.196, suggesting a more reasonable fit to the data (but one that could lend itself to further possible improvement).
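The nested-model comparisons quoted for the grade 7 network can be reproduced by simple subtraction. The sketch below takes the Table 4 values of $G^2_{PL}$ to be 1794.1, 1797.3, 1540.4, 1585.5, 1558.4, 1511.0, 1220.6 and 1032.3 for models 1a to 5b (the decimal placement is inferred from the differences reported in the text) and recomputes each $\Delta G^2_{PL}$ together with the difference in the number of parameters. The variable and function names are assumptions made for the example.

```python
# G^2_PL and parameter counts for the grade 7 models, as read from Table 4.
g2_pl = {"1a": 1794.1, "1b": 1797.3, "2": 1540.4, "3a": 1585.5,
         "3b": 1558.4, "4": 1511.0, "5a": 1220.6, "5b": 1032.3}
n_params = {"1a": 2, "1b": 1, "2": 3, "3a": 4, "3b": 5, "4": 13, "5a": 19, "5b": 23}

# (restricted model, more general model) pairs discussed in the text.
comparisons = [("1b", "1a"), ("1a", "2"), ("1a", "3a"), ("3a", "3b"),
               ("3b", "4"), ("4", "5a"), ("5a", "5b")]


def compare(restricted, general):
    """Return (Delta G^2_PL, number of additional parameters)."""
    delta = round(g2_pl[restricted] - g2_pl[general], 1)
    return delta, n_params[general] - n_params[restricted]


results = {pair: compare(*pair) for pair in comparisons}
for (restricted, general), (delta, extra) in results.items():
    print(f"{restricted} -> {general}: delta G2_PL = {delta}, extra parameters = {extra}")
```

Since the distribution of $G^2_{PL}$ is unknown, these differences are best read as descriptive indices of improvement rather than as formal test statistics.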

The MPLEs for the parameters of model 5b are displayed in Table 5. Positive estimates were observed for both reciprocity parameters and for the parameters associated with three of the four additional hypothesized dependencies. Thus, the conditional odds of a tie of any type appear to be enhanced if a reciprocal tie of the same type is present, if the tie completes one of the expected triadic structures for agreement between friends, or if the tie completes a triad in which an individual would rather not have as a friend any friend of someone who has been indicated as a non-friend. Negative estimates were obtained for the exchange parameter, for 2-stars comprising two incoming $X_B$ ties, and for 3-cycles comprising $X_B$ ties. Thus, the conditional odds of a tie of any type appear to be reduced by the presence of a reciprocated tie of the other type; in addition, the odds of an $X_B$ tie being directed to a particular individual are reduced if other $X_B$ ties are also directed to the same individual, or if the tie completes a 3-cycle of $X_B$ ties.
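One way to read such estimates follows from the logit form (5): holding the other change statistics fixed, a unit increase in a change statistic multiplies the conditional odds of the tie by $\exp(\theta)$. The short sketch below applies this to two values taken from Table 5, read as 2.53 for the $B \cap B'$ reciprocity parameter and -0.67 for the $B \cap N'$ exchange parameter; the decimal placement and the function name are assumptions made for the example.

```python
import math


def odds_multiplier(theta):
    """Factor by which the conditional odds change per unit of the change statistic."""
    return math.exp(theta)


# Reciprocity of 'best friends' ties, and exchange of a 'best friends' tie
# for a 'not friends' tie, using the estimates as read from Table 5.
for label, theta in [("B reciprocity", 2.53), ("B/N exchange", -0.67)]:
    print(f"{label}: odds multiplied by {odds_multiplier(theta):.2f}")
```

These multipliers are conditional on the rest of the network and inherit the uncertainty of the MPLEs, so they indicate direction and rough magnitude rather than precise effects.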

4.2. Padgett & Ansell's Florentine network

Our second example is an analysis of marriage and business ties among groups of Florentine families (Padgett & Ansell, 1993). In an analysis of the rise to power of the Medici family in Florence in the early fifteenth century, Padgett & Ansell constructed a number of network relations among 33 groups of elite families, including marriage and business or economic ties. The construction was based on a coding of various types of network relations among a 92-family ruling elite from Kent's (1978) description of the network foundations of the Medici party and their opponents. Padgett & Ansell used marriage and economic networks to derive a clustering of the 92 families into 33 family groups (using the CONCOR algorithm; see Breiger, Boorman & Arabie, 1975); they then coded a relation of a particular type between two family groups if there were at least two pairs of families, with one family from each group, linked by a relation of that type. The analysis presented below is for marriage and economic relations among these 33 family groups, shown in Figure 2a of Padgett & Ansell (1993); for the purpose of the analysis reported below, within-group relationships are ignored, and the various types of economic ties are aggregated into a single business/economic relation. Thus, a marriage tie is coded from one group to another if a woman of the first group is married to a man in the second; a business/economic tie signifies the presence of trading or partnership relationships, the sharing or renting of real estate, or a bank employment relation (see Padgett & Ansell, 1993, pp. 1265–1266).

Table 5. Parameter estimates for model 5b fitted to the grade 7 peer network

Model parameter                        Z               $\hat{\theta}_Z$   Approximate standard error
1-paths (choice)                       $B$             -1.81              0.76
                                       $N$             -2.39              0.65
2-cycles (reciprocity & exchange)      $B \cap B'$      2.53              0.37
                                       $N \cap N'$      0.61              0.26
                                       $B \cap N'$     -0.67              0.28
2-paths                                $BB$             0.01              0.05
                                       $BN$            -0.03              0.04
                                       $NB$            -0.11              0.04
                                       $NN$             0.02              0.04
3-cycles                               $BB \cap B'$    -0.72              0.14
                                       $BN \cap B'$     0.05              0.08
                                       $BN \cap N'$     0.03              0.07
                                       $NN \cap N'$    -0.05              0.09
2-stars                                $BB'$           -0.36              0.08
                                       $BN'$           -0.08              0.04
                                       $NN'$            0.06              0.04
                                       $B'B$           -0.01              0.04
                                       $B'N$           -0.04              0.03
                                       $N'N$            0.07              0.02
Additional hypothesized constraints    $BB \cap B$      0.57              0.06
                                       $BN \cap N$      0.17              0.05
                                       $NB \cap N$      0.33              0.05
                                       $NN \cap B$     -0.09              0.06

Padgett & Ansell used the interconnections among social and demographic factors, these relational ties, and actions on the part of Cosimo de' Medici to explain the source of the latter's extraordinary power; here, we examine the joint network structure of the marriage and business/economic ties.

We label the relations studied by Padgett & Ansell as $X_B$ (business ties) and $X_M$ (marriage ties). Their associated matrices are $B$ and $M$, respectively.

In Table 6, we report the fit of six classes of models, similar in construction to those reported for the grade 7 peer network. As for the grade 7 peer network, models 1a and 1b are two- and one-parameter complete independence models, respectively, and model 2 is a multiplexity model. It is clear from Table 6 that there is little improvement in fit of the two-parameter choice complete independence model (model 1a) over the one-parameter choice model (model 1b) ($\Delta G^2_{PL} = 0.7$ with one extra parameter); in addition, permitting dependencies among marriage and business ties for the same individuals does little to improve model fit ($\Delta G^2_{PL} = 0.4$ for model 2 compared to model 1a). Models 3a and 3b are reciprocity and exchange models. Model 3a adds to model 1a the reciprocity effects for $X_B$ and $X_M$ ties; model 3b further adds the exchange effect that allows conditional dependence of a marriage tie from $i$ to $j$ and a business tie from $j$ to $i$. The reciprocity effects in model 3a lead to a substantial improvement in fit over model 1a ($\Delta G^2_{PL} = 164.0$ with two additional parameters), but no further improvement is achieved by permitting the dyadic exchange of marriage and business ties ($\Delta G^2_{PL} = 0.2$). Model 4 is a path-dependent model and is a marginal improvement in fit over model 3b ($\Delta G^2_{PL} = 45.1$ with six additional parameters). Parameters corresponding to cycles with two or more business ties were excluded from the model because of the infrequency of occurrence of such structures.

Since, as Padgett & Ansell (1993) note, the gaining of hierarchical status was the primary consideration in the arrangement of marriage ties between elite families, we might expect marriage ties to exhibit a tendency towards transitivity. Hence, model 5a assumes, in addition to conditional dependencies for paths of length 2, pairwise conditional dependencies among marriage ties from $i$ to $j$, $j$ to $k$ and $i$ to $k$ (and hence adds a parameter corresponding to the relation $X = MM \cap M$). Further, all possible stars comprising two relations are added as well, in order to investigate possible interdependencies between marriage and business ties that are not evident at the level of ties from an actor $i$ to an actor $j$ (see the comparison between the complete independence model 1a and the multiplex model 2). These dependencies also require various star parameters $\theta_Z$, for $Z$ equal to $MM'$, $M'M$, $M'B$ and $BB'$.

Table 6. Summary of fit of models 1a–6d to the Florentine network

Model   No. of parameters   $G^2_{PL}$   Mean absolute residual
1a       2                  487.2        0.048
1b       1                  487.9        0.048
2        3                  486.8        0.048
3a       4                  323.2        0.032
3b       5                  323.0        0.032
4       11                  277.9        0.029
5a      18                  243.7        0.026
5b      17                  246.3        0.026
6a      21                  227.9        0.026
6b      23                  226.7        0.026
6c      23                  225.2        0.026
6d      23                  217.0        0.025

The fit of model 5a was a modest improvement over that of model 4 ($\Delta G^2_{PL} = 34.2$ with six additional parameters). The estimated parameter corresponding to the relation $MM \cap M$ is not large, so in model 5b the parameter is removed, with little effect on the fit of the model ($\Delta G^2_{PL} = 2.6$).

A final set of models fitted to the data investigated the possibility of structural differences in ties according to party affiliation. As Padgett & Ansell (1993) observed, the first 10 family groups are substantially identified with the Medici party (the Medici family themselves comprising group 1), whereas the remaining groups of families are not. Padgett & Ansell described the remarkable structural differences between the network of relations within the Medici party and within the remaining (largely oligarchic) set. Models 6a–6d therefore allow various model 5b parameters to differ according to whether they refer to ties lying within the collection of Medici blocks, to ties connecting non-Medici blocks, or to ties crossing the boundary between the two collections of blocks. Model 6a allows such variation for the density parameter and is a substantial improvement over model 5b ($\Delta G^2_{PL} = 18.4$ with four additional parameters). Model 6b permits the parameters for 'mixed' out-stars comprising marriage and business ties to differ for the three types of blocks, and is not a substantial improvement over model 6a ($\Delta G^2_{PL} = 1.4$). Model 6c allows heterogeneity across blocks in the parameters for 2-paths comprising marriage and business ties; it also fails to improve fit compared to model 6a ($\Delta G^2_{PL} = 2.5$). The final model, 6d, permits heterogeneity across blocks in the parameters for paths comprising two marriage ties; in this case, there is a modest improvement in fit compared to model 6a ($\Delta G^2_{PL} = 10.8$ with two additional parameters).

The estimated parameters for model 6d are shown in Table 7. The estimates suggest a strong tendency for reciprocated business ties, a tendency that is unsurprising given the form of business or economic ties, such as partnerships. There are weaker tendencies for the existence of 2-paths comprising either marriage or business ties; marriage ties also appear to be more likely if they complete a cycle of three marriage ties. Padgett & Ansell (1993) noted the presence of these cycles and analysed both their development and their consequences; they make a compelling argument for their importance to the evolving structure of the oligarchy. It can also be seen from Table 7 that path structures in which an outgoing marriage tie is accompanied by an incoming business tie reduce the likelihood of the overall structure. Estimates of star parameters suggest the prevalence of heterogeneous stars, in which a group of families have marriage ties with one group and business ties with another. The parameter estimates for homogeneous marriage in-stars and out-stars are both negative; there appears to have been a reduced conditional probability of a marriage tie to a family group if some other group also had such a tie and, to a lesser extent, if the first family group had another outgoing marriage tie.
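The three types of block that models 6a–6d distinguish can be encoded as a simple covariate on ordered pairs of family groups. The sketch below assumes that groups are labelled 1 to 33 with the first 10 being the Medici groups, as stated above, and uses one 0/1 indicator per block type for a block-specific density effect. The indicator coding and the function names are assumptions made for the example; the models reported here may use a different, equivalent contrast coding.

```python
def block_type(i, j, n_medici=10):
    """Classify the ordered pair of family groups (i, j), with groups labelled 1..33."""
    i_medici = i <= n_medici
    j_medici = j <= n_medici
    if i_medici and j_medici:
        return "within_medici"
    if not i_medici and not j_medici:
        return "within_other"
    return "between"


def block_density_covariates(i, j):
    """Hypothetical block-specific density change statistics: one indicator per block type."""
    kind = block_type(i, j)
    return [int(kind == name) for name in ("within_medici", "between", "within_other")]


# Count the ordered pairs of distinct groups falling in each block type.
counts = {}
for i in range(1, 34):
    for j in range(1, 34):
        if i != j:
            kind = block_type(i, j)
            counts[kind] = counts.get(kind, 0) + 1
print(counts)
```

Block-specific versions of other effects (for example, 2-paths of marriage ties) could be built in the same way by multiplying the relevant change statistic by these indicators, although how a path spanning two blocks should be classified is a modelling decision that this sketch does not settle.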

The parameters for block-dependent densities suggest an enhanced likelihood of marriage ties within the Medici collection of family groups and, to a lesser extent, within the non-Medici collection; marriage ties between the two types of family groups were less likely. Business ties exhibit a substantially weaker pattern of the same form. Together, these characteristics of the network reflect what Padgett & Ansell noted was a remarkable interdependence of marriage and economic ties, on the one hand, and political partisanship, on the other, and they support their conclusion that the microstructure of marriage and economics was central to the formation of parties in Florence (1993, p. 1277). The block-dependence of marriage 2-paths takes a different and interesting form: such paths are less likely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection, and they are even more likely to link family groups of different types. The group containing members of the Medici family is the major contributor to this pattern, as they are the only Medici group with marriage connections outside the collection mobilized into the Medici party. Note that this structural effect is fitted at the same time as the cyclic pattern for marriage ties so that although, as Padgett & Ansell noted, there are many more two-step marriage connections for non-Medici than for Medici partisans, many of the former connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).

Table 7. Parameter estimates for model 6d fitted to the Florentine network

Model parameter                         Z               $\hat{\theta}_Z$   Approximate standard error
1-paths (choice)                        $M$             -5.17              1.02
                                        $B$             -7.37              1.25
2-cycles (reciprocity and exchange)     $M \cap M'$      0.95              0.94
                                        $B \cap B'$     10.33              1.72
                                        $M \cap B'$      0.65              1.08
2-paths                                 $MM$             0.66              0.32
                                        $MB$             0.16              0.38
                                        $BM$            -0.84              0.37
                                        $BB$             1.26              0.95
3-cycles                                $MM \cap M'$     2.12              0.61
                                        $MB \cap M'$    -0.35              0.85
2-stars                                 $MM'$           -1.55              0.37
                                        $M'M$           -0.43              0.20
                                        $BB'$           -1.53              1.08
                                        $B'B$           -0.85              0.99
                                        $MB'$           -0.14              0.36
                                        $M'B$            0.92              0.35
Subgroup-dependent M effects
  1-paths within Medici                                  3.71              1.12
  1-paths between subgroups                             -4.67              1.92
  1-paths within other subgroups                         0.96
Subgroup-dependent B effects
  1-paths within Medici                                  0.70              1.06
  1-paths between subgroups                             -0.80              0.87
  1-paths within other subgroups                         0.10
Subgroup-dependent MM effects
  2-paths within Medici                                 -1.33              0.46
  2-paths between subgroups                              1.08              0.44
  2-paths within other subgroups                         0.25

Thus, model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell's detailed account.

5. Conclusion

The multivariate p* model is very general in form and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996), and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number $r$ of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research; much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus, the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies, or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form z(x) of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different 'parts' of the network (and in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that in the literature on algebraic models for multivariate networks there is a second tradition that can usefully guide such statistical developments. Local structural descriptions based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995) describe heterogeneity across individuals. Thus, a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of maximum pseudo-likelihood estimates (MPLEs). Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown in general how Markov chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).
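To make the role of pseudo-likelihood concrete, the following Python sketch assembles the data that a pseudo-likelihood fit would use. It relies on the logit formulation of Strauss & Ikeda (1990) and Wasserman & Pattison (1996), in which the conditional logit of each tie is linear in the change statistics z(x+) − z(x−). The three statistics chosen, the function names and the 0-based indexing are assumptions made purely for illustration, and the array is assumed to have a zero diagonal.

```python
import numpy as np

def statistics(X):
    """Hypothetical z(x): ties on relation 0, ties on relation 1, multiplex ties."""
    return np.array([
        X[:, :, 0].sum(),
        X[:, :, 1].sum(),
        (X[:, :, 0] * X[:, :, 1]).sum(),
    ], dtype=float)

def change_statistics(X):
    """Return the observed ties y and the rows z(x+_ijm) - z(x-_ijm), i != j."""
    g, _, r = X.shape
    y, rows = [], []
    for i in range(g):
        for j in range(g):
            if i == j:
                continue
            for m in range(r):
                plus, minus = X.copy(), X.copy()
                plus[i, j, m] = 1    # tie forced to be present
                minus[i, j, m] = 0   # tie forced to be absent
                y.append(X[i, j, m])
                rows.append(statistics(plus) - statistics(minus))
    return np.array(y), np.vstack(rows)

# Small illustrative network: g = 3 actors, r = 2 relations.
X = np.zeros((3, 3, 2), dtype=int)
X[0, 1, 0] = 1
X[0, 1, 1] = 1
X[1, 0, 0] = 1
y, Z = change_statistics(X)
```

The pair (y, Z) could then be passed to any logistic regression routine, fitted without an intercept, to obtain MPLEs. The standard errors reported by such a routine should not be taken at face value, which is the evaluation problem discussed above.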

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754) and the National Institute of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance, and Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.

Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.

Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.

Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Association, 47, 77–92.

Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.

Boorman, S. A., & White, H. C. (1976). Social structure from multiple networks. II. Role structures. American Journal of Sociology, 81, 1384–1446.

Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.

Breiger, R. L., Boorman, S. A., & Arabie, P. (1975). An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.

Coleman, J. S., Katz, E., & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N., & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R., & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B., & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure II: Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E., & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M., & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.

Fienberg, S. E., Meyer, M. M., & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds), Operations research proceedings 1986, pp. 455–460. Berlin/Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathématiques, Informatique et Sciences Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B., & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O., & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds), Quo Vadis, Graph Theory? Annals of Discrete Mathematics, 55, 349–366.

Frank, O., & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J., & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J., & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W., & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W., & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J., & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D., & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.


Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L., & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E., & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A., & Pattison, P. (1998). Multivariate p* models of producers' market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P., & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F., & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P., & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G., & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). p12. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D., & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M., & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.


Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S., & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S., & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 60, 401–425.

Wasserman, S., & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S., & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A., & Breiger, R. L. (1976). Social structure from multiple networks. I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C., & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999


Fienberg, Meyer & Wasserman (1985), Wasserman (1987), Iacobucci & Wasserman (1987) and Iacobucci (1989) extended the p1 family to social networks in which more than one relation is measured, building on earlier work of Davis (1968), Galaskiewicz & Marsden (1978), Fienberg & Wasserman (1981), Holland & Leinhardt (1981) and Fienberg, Meyer & Wasserman (1981). All of these models assume dyadic independence, an assumption that has since come to be seen as unduly restrictive (see Wasserman & Faust, 1994, Chapter 15). Other interesting approaches to the statistical analysis of multiple relations include Katz & Powell (1953), Hubert & Baker (1978), Frank, Lundquist, Wellman & Wilson (1986) and Wellman, Frank, Espinoza, Lundquist & Wilson (1991). The scarcity of a literature on statistical models for multivariate graphs (or multirelational networks) is underscored by the fact that only recently has there been any work on conditional uniform multigraph distributions (Wasserman & Pattison, in press).

It would clearly be useful, therefore, to construct models for multivariate networks that possess a statistical basis but that do not make the implausible assumption of dyadic independence. The purpose of this paper is to describe such models. These first arose as models for lattice structures (Ising, 1925), and have found much use in spatial applications (Besag, 1975, 1977a; Wasserman, 1978; Strauss, 1992). They may also be seen as a special case of models described in the graphical modelling literature (for example, Cox & Wermuth, 1996; Edwards, 1995; Lauritzen, 1996; Whittaker, 1990). Wasserman & Pattison (1996) elaborated upon Frank & Strauss's (1986) application of these models to social networks, and utilized the standard pseudo-likelihood estimation approach to fitting these models, first described by Besag (1975, 1977b) and applied to networks by Strauss & Ikeda (1990).

After presenting notation, we introduce the multivariate p* model. We describe a number of particularly useful forms of the model and illustrate their application to two quite different multivariate networks.

2. Some notation

We adhere to the notation presented in Wasserman & Pattison (1996). A social network is defined as a set of g social actors and a collection of r social relations that specify how these actors are related to one another.

We let N denote the set of actors, N = {1, 2, ..., g}, and let $\mathcal{X}_m$ denote a social relation of type m. $\mathcal{X}_m$ is a set of ordered pairs recording the presence or absence of relational ties of type m between pairs of actors. If the ordered pair (i, j) is in this set, then the first actor (i) in the pair has a relational tie of type m to the second actor (j) in the pair. We let R = {1, 2, ..., r} denote the set of relation types, or labels.

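Any social relation $\mathcal{X}_m$ can be represented by a g × g matrix, often referred to as a sociomatrix, $\mathbf{X}_m$, where

$$(\mathbf{X}_m)_{ij} = \begin{cases} 1 & \text{if } (i, j) \in \mathcal{X}_m, \\ 0 & \text{otherwise.} \end{cases}$$

The converse of the relation $\mathcal{X}_m$, which we denote by $\mathcal{X}'_m$, is represented by

$$(\mathbf{X}'_m)_{ij} = \begin{cases} 1 & \text{if } (j, i) \in \mathcal{X}_m, \\ 0 & \text{otherwise.} \end{cases}$$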
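The two definitions above can be sketched in Python as follows. This is a minimal illustration only: the function name `sociomatrix`, the example relation and the shift from actor labels 1, ..., g to 0-based array indices are assumptions of the sketch, not part of the notation.

```python
import numpy as np

def sociomatrix(pairs, g):
    """Return the g x g binary sociomatrix for a relation given as ordered pairs.

    Actors are labelled 1..g as in the text; array indices are 0-based.
    """
    X = np.zeros((g, g), dtype=int)
    for i, j in pairs:
        X[i - 1, j - 1] = 1
    return X

# Hypothetical relation on g = 3 actors: 1 -> 2, 2 -> 3 and 3 -> 1.
pairs_m = {(1, 2), (2, 3), (3, 1)}
X_m = sociomatrix(pairs_m, g=3)

# The converse relation reverses every ordered pair, so its sociomatrix
# is the transpose of X_m.
X_m_converse = X_m.T
```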

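In the general multivariate case, relational ties are recorded for r relations, $\mathcal{X}_1, \mathcal{X}_2, \ldots, \mathcal{X}_r$, with associated (socio)matrices $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_r$. We also consider the relations constructed from intersections and compositions of these r measured relations. Formally, the intersection $\mathcal{X}_k \cap \mathcal{X}_h$ of relations $\mathcal{X}_k$ and $\mathcal{X}_h$ is given by the array $\mathbf{X}_k \cap \mathbf{X}_h$, which has entries

$$(\mathbf{X}_k \cap \mathbf{X}_h)_{ij} = \begin{cases} 1 & \text{if } (\mathbf{X}_k)_{ij} = 1 \text{ and } (\mathbf{X}_h)_{ij} = 1, \\ 0 & \text{otherwise.} \end{cases}$$

The composition (or compound relation) $\mathcal{X}_k\mathcal{X}_h$ of relations $\mathcal{X}_k$ and $\mathcal{X}_h$ is given by the array $\mathbf{X}_k\mathbf{X}_h$, which has entries

$$(\mathbf{X}_k\mathbf{X}_h)_{ij} = \begin{cases} 1 & \text{if } (\mathbf{X}_k)_{il} = 1 \text{ and } (\mathbf{X}_h)_{lj} = 1 \text{ for some } l \in N, \\ 0 & \text{otherwise.} \end{cases}$$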
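As a small illustration of these two operations, the sketch below computes the intersection as an elementwise logical 'and' and the composition as a Boolean matrix product. The function names and the example matrices are hypothetical.

```python
import numpy as np

def intersection(Xk, Xh):
    """Entry (i, j) is 1 when both relations have a tie from i to j."""
    return (Xk & Xh).astype(int)

def composition(Xk, Xh):
    """Entry (i, j) is 1 when some actor l has Xk[i, l] = 1 and Xh[l, j] = 1."""
    return ((Xk @ Xh) > 0).astype(int)

# Hypothetical sociomatrices for two relations on three actors.
Xk = np.array([[0, 1, 0],
               [0, 0, 1],
               [0, 0, 0]])
Xh = np.array([[0, 1, 1],
               [0, 0, 1],
               [1, 0, 0]])

both = intersection(Xk, Xh)   # ties present in both relations
path = composition(Xk, Xh)    # two-step connections: Xk first, then Xh
```

The integer matrix product counts the intermediate actors l; thresholding at zero recovers the binary array defined above.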

Since we are concerned here with multivariate networks, we also view the sequence of matrices $(\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_r)$ as defining a three-way array $\mathbf{X}$ of size g × g × r, with entries $X_{ijm} = (\mathbf{X}_m)_{ij}$. Since these (socio)matrices will be assumed to be random quantities, we use lower-case bold-face characters (such as $\mathbf{x}$) to denote realizations of the random quantities.

To specify the multivariate p* model, we introduce several new arrays constructed from $\mathbf{X}$. First, we define $\mathbf{X}^{+}_{ijm}$ as the array formed from $\mathbf{X}$ where the tie from i to j of type m is forced to be present:

$$(\mathbf{X}^{+}_{ijm})_{nop} = \begin{cases} X_{nop} & \text{if } (n, o, p) \neq (i, j, m), \\ 1 & \text{if } (n, o, p) = (i, j, m). \end{cases}$$

Thus $\mathbf{X}^{+}_{ijm}$ differs at most from $\mathbf{X}$ by the (i, j, m)th entry, which is forced to be 1. Next, we define $\mathbf{X}^{-}_{ijm}$ as the array formed from $\mathbf{X}$ where the tie from i to j of type m is forced to be absent:

$$(\mathbf{X}^{-}_{ijm})_{nop} = \begin{cases} X_{nop} & \text{if } (n, o, p) \neq (i, j, m), \\ 0 & \text{if } (n, o, p) = (i, j, m). \end{cases}$$

We also define $\mathbf{X}^{c}_{ijm}$ as the array for the complement relation for $\mathbf{X}$ of the tie from i to j of type m:

$$(\mathbf{X}^{c}_{ijm})_{nop} = \begin{cases} X_{nop} & \text{if } (n, o, p) \neq (i, j, m), \\ \text{undefined} & \text{if } (n, o, p) = (i, j, m). \end{cases}$$

The complement relation has no relational tie of type m coded from i to j; one can view this single variable as missing.
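The three derived arrays can be sketched in Python as below. The helper names are hypothetical, indices are 0-based, and representing the undefined entry of the complement array by NaN is one possible convention assumed here, not something prescribed by the model.

```python
import numpy as np

def force_tie(X, i, j, m, value):
    """Return a float copy of the g x g x r array X with entry (i, j, m) set to value."""
    Y = X.astype(float)   # astype returns a new array, so X is left unchanged
    Y[i, j, m] = value
    return Y

def x_plus(X, i, j, m):
    """Array with the tie from i to j of type m forced to be present."""
    return force_tie(X, i, j, m, 1.0)

def x_minus(X, i, j, m):
    """Array with the tie from i to j of type m forced to be absent."""
    return force_tie(X, i, j, m, 0.0)

def x_complement(X, i, j, m):
    """Array with the (i, j, m) entry treated as missing (NaN by assumption)."""
    return force_tie(X, i, j, m, np.nan)

# Hypothetical network with g = 3 actors and r = 2 relations.
X = np.zeros((3, 3, 2), dtype=int)
X[0, 1, 0] = 1
Xp = x_plus(X, 1, 2, 1)
```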

As in Wasserman & Pattison (1996), we will let ω represent logits, that is, log-odds ratios comparing the probability of one outcome of a random variable to the probability of another outcome on a logarithmic scale.

3. Multivariate p*

The original specification of the class of models p* was just for a single, dichotomous relation, as described by Wasserman & Pattison (1996) (see also Frank & Strauss, 1986; Rennolls, 1995; Strauss & Ikeda, 1990). Generalizations to more than one relation were mentioned in concluding remarks by Frank & Strauss (1986, Section 6) and by Strauss & Ikeda (1990, Section 5), and discussed in brief by Frank (1991, 1997) and Frank & Nowicki (1993).


3.1. Theory

We first present a generalization of the basic p* model to multivariate networks. We define a set of random variables based on the relational ties in the network, and then construct a dependence graph for this situation. The Hammersley–Clifford theorem (Besag, 1974) posits a probability distribution for these random variables by using the postulated dependence graph. The exact form of the dependence graph depends on the nature of the substantive hypotheses about the social network under study; we discuss such hypotheses at length.

3.1.1. Probability models for multivariate directed random graphs

Any observed multivariate network may be regarded as a realization $\mathbf{x} = [x_{ijm}]$ of a random three-way binary array $\mathbf{X} = [X_{ijm}]$. In general, the entries of the array $\mathbf{X}$ cannot be assumed to be independent; consequently, it is helpful to specify a dependence structure for the random variables $X_{ijm}$, as originally suggested by Frank & Strauss (1986).

The dependence structure for these random variables is determined by the dependence graph D of the random array $\mathbf{X}$. D is itself a graph whose nodes are elements of the index set $\{(i, j, m) : i, j \in N,\ i \neq j,\ m \in R\}$ for the random variables in $\mathbf{X}$, and whose edges signify pairs of the random variables that are assumed to be conditionally dependent (given the values of all other random variables). More formally, a dependence graph for a multivariate social network has node set

$$N_D = \{(i, j, m) : i, j \in N,\ i \neq j,\ m \in R\}.$$

The edges of D are given by

$$E_D = \{((i, j, m), (k, l, h)) : X_{ijm} \text{ and } X_{klh} \text{ are conditionally dependent}\}.$$

The dependence graph is an example of what is termed an independence graph in the graphical modelling literature (for example, Lauritzen, 1996); see Robins (1998) for an extended discussion of the application of graphical modelling techniques to social network models.

As Frank & Strauss (1986) observed for univariate graphs and associated two-way binary arrays, several well-known classes of distributions for random graphs may be specified in terms of the structure of the dependence graph. For example, the assumption of conditional independence for all pairs of random variables representing distinct relational ties (that is, $X_{ijm}$ and $X_{klh}$ are independent whenever i ≠ k and/or j ≠ l) leads to the class of Bernoulli multigraphs (see Frank & Nowicki, 1993; Wasserman & Pattison, in press); the assumption of conditional dependence of $X_{ijm}$ and $X_{klh}$ if and only if {i, j} = {k, l} leads to the class of multivariate dyad independence models (see Wasserman, 1987; Wasserman & Pattison, in press). The assumption of conditional independence of $X_{ijm}$ and $X_{klh}$ if and only if {i, j} ∩ {k, l} = ∅ gives rise to the class of multivariate Markov random graphs. Of course, if the dependence graph is fully connected, then a general class of random graphs is obtained.
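For a small network, the node set of the dependence graph and its edges under the Markov assumption can be enumerated directly. The sketch below is illustrative only; the function names are hypothetical and the choice of g = 4 actors and r = 2 relations is arbitrary.

```python
from itertools import combinations, permutations

def dependence_nodes(g, r):
    """Node set N_D: all (i, j, m) with i != j, actors 1..g, relation types 1..r."""
    return [(i, j, m)
            for i, j in permutations(range(1, g + 1), 2)
            for m in range(1, r + 1)]

def markov_edges(nodes):
    """Edges of D when X_ijm and X_klh are conditionally dependent
    if and only if {i, j} and {k, l} have at least one actor in common."""
    return [(a, b) for a, b in combinations(nodes, 2)
            if {a[0], a[1]} & {b[0], b[1]}]

nodes = dependence_nodes(g=4, r=2)
edges = markov_edges(nodes)
```

With four actors, the only pairs of variables left unconnected are those on disjoint dyads, such as (1, 2, m) and (3, 4, h).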

The Hammersley–Clifford theorem (Besag, 1974) establishes that a probability model for $\mathbf{X}$ depends only on the complete subgraphs, or cliques, of the dependence graph D. (A subset A of $N_D$ is complete if every pair of nodes in A is linked by an edge of D. A subset comprising a single node is also regarded as complete.) In particular, application of the Hammersley–Clifford theorem yields a characterization of $P(\mathbf{X} = \mathbf{x})$ in the form of an exponential family of distributions:

$$P(\mathbf{X} = \mathbf{x}) = \left(\frac{1}{\kappa}\right) \exp\left\{ \sum_{A \subseteq N_D} \lambda_A \prod_{(i,j,m) \in A} x_{ijm} \right\} \qquad (1)$$

where $\kappa = \sum_{\mathbf{x}} \exp\left\{ \sum_{A \subseteq N_D} \lambda_A \prod_{(i,j,m) \in A} x_{ijm} \right\}$ is a normalizing quantity; D is the dependence graph for $\mathbf{X}$ (the summation is over all subsets A of nodes of D); $\prod_{(i,j,m) \in A} x_{ijm}$ is the sufficient statistic corresponding to the parameter $\lambda_A$; and $\lambda_A = 0$ whenever the subgraph induced by the nodes in A is not a clique of D.

The set of non-zero parameters in a model for $P(\mathbf{X} = \mathbf{x})$ is thus determined by the collection of the maximal cliques of the dependence graph. A maximal clique is a complete subgraph that is not properly contained in any other complete subgraph. Note that any subgraph of a complete subgraph is also complete, so that if A is a maximal clique of D, then there will be non-zero parameters for A and all of its subgraphs.
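For a very small network, model (1) can be evaluated by brute force, which may help to fix ideas. The sketch below enumerates every binary array, computes the exponent from a dictionary of clique parameters, and normalizes. The parameter values, the choice of g = 2 actors and r = 2 relations, and all function names are hypothetical; exhaustive enumeration is feasible only for tiny networks.

```python
from itertools import product
from math import exp

def log_weight(x, lam):
    """Exponent of model (1): sum over cliques A of lambda_A times the product
    of x_ijm over A. The product equals 1 only when every tie in A is present.
    Subsets absent from lam are treated as having lambda_A = 0."""
    return sum(value for A, value in lam.items() if all(x[node] for node in A))

def probabilities(nodes, lam):
    """Enumerate all 2^|nodes| arrays and normalize by kappa."""
    states = [dict(zip(nodes, bits))
              for bits in product((0, 1), repeat=len(nodes))]
    weights = [exp(log_weight(x, lam)) for x in states]
    kappa = sum(weights)   # normalizing quantity
    return [(x, w / kappa) for x, w in zip(states, weights)]

# Hypothetical example: two actors, two relations, with single-tie parameters
# and one multiplexity parameter per ordered pair.
nodes = [(1, 2, 1), (1, 2, 2), (2, 1, 1), (2, 1, 2)]
lam = {frozenset({node}): -1.0 for node in nodes}
lam[frozenset({(1, 2, 1), (1, 2, 2)})] = 0.5
lam[frozenset({(2, 1, 1), (2, 1, 2)})] = 0.5
dist = probabilities(nodes, lam)
```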

3.1.2. Dependence structures for social networks

It is clear from model (1) (which we can refer to as the multivariate p* distribution) that, in order to construct a probability model for a multivariate random array, we need to specify an appropriate dependence structure. We therefore consider some likely forms of dependencies arising in multivariate arrays constructed from various types of social networks. The literature on structural models for social networks contains a number of theoretical claims about the structural properties of networks that can be used to construct candidate dependence structures.

Multiplexity: Interdependence of relations linking a pair of individuals. The large literature on role-sets (Merton, 1957; Winship & Mandel, 1983; see also Chapter 12 of Wasserman & Faust, 1994) attests to the widespread belief that there is a likely dependence between different ties linking any given pair of individuals. The essence of the claim is that the presence of one type of tie between individuals is likely to affect the presence of other types of tie, and that, over time, distinctive role-sets comprising the relations linking a pair of individuals characterize the relationship from individual i to individual j. Such multiplex interdependencies lead to maximal cliques in the dependence graph of the form {(i, j, 1), (i, j, 2), ..., (i, j, r)}; if these are the only dependencies that are assumed, the general class of Bernoulli multigraphs is obtained. The parameters of the model have the form $\lambda_A$, where A is any subset of the form $\{(i, j, m_1), (i, j, m_2), \ldots, (i, j, m_q)\}$; the corresponding sufficient statistic in model (1) is $\prod_{h=1}^{q} x_{ijm_h}$. Of course, a special case of the model is obtained if we assume complete independence of all observations. The maximal cliques of the dependence graph are then of the form {(i, j, m)}, with exactly one parameter corresponding to each random variable in the array. If homogeneity is assumed for all node pairs (i, j), then the model has just one parameter for each relation m (with sufficient statistic $\sum_{i,j} x_{ijm}$; if homogeneity is imposed across all random variables in the multivariate network, then the model possesses a single parameter with sufficient statistic $\sum_{i,j,m} x_{ijm}$).

Exchange and reciprocity. It has also been widely argued that a tie of one type from an individual i to another individual j may be conditionally dependent on ties from j to i of other types (see, for example, Parsons, 1966, for a traditional appeal to mutually consistent expectations through role norms, and Leifer, 1988, for an alternative and interesting dynamic account). If such conditional dependencies alone are assumed, then maximal cliques of D have the form {(i, j, m), (j, i, l)}. If these conditional dependencies are assumed, as well as the multiplex conditional dependencies, maximal cliques have the form

$$\{(i, j, 1), (i, j, 2), \ldots, (i, j, r), (j, i, 1), (j, i, 2), \ldots, (j, i, r)\}.$$

In the latter case, model (1) describes the multivariate dyad independence model, termed the multivariate p1 model (Wasserman, 1987).

Role interlocking: path dependence. A third type of argument has pointed to the potential importance of role interlocking in social networks (for example, Boorman & White, 1976; Boyd, 1991; Lorrain & White, 1971; Pattison, 1993; White, 1977). It has been argued that the interrelationships among distinct types of ties can be represented by a partial ordering among labelled paths in a social network, where labelled paths systematically trace connections among sequences of individuals (see Pattison, 1993). More specifically, a path with the label mh links individual i to individual j if there is a tie of type m from i to some intermediate individual l and a tie of type h from l to j (that is, if $X_{ilm} = 1$ and $X_{ljh} = 1$). Longer paths are defined recursively: a path with the label mhn links individual i to individual j if there is some individual l such that i is linked to l by a path with the label mh and l is linked to j by a tie with the label n. We refer to the path mhn as the concatenation of paths mh and n, and note that concatenation is associative; that is, paths constructed as the concatenation of mh and n link precisely the same pairs of individuals as paths constructed by the concatenation of m and hn. Paths in networks have been claimed both to provide the essential framework for the flow of social processes, as, for example, in the research on the diffusion of innovations (see, for example, Coleman, Katz & Menzel, 1966; Michaelson, 1990), and to give rise to some powerful anticipatory effects (see Lee, 1969; Mayer, 1977). The most rudimentary form of dependence associated with social structures conceived in this form involves conditional dependence between the variables $X_{ilm}$ and $X_{ljh}$. The maximal cliques induced by such an assumption are cycles of length 2, {(i, j, m), (j, i, h)}, and cycles of length 3, {(i, j, m), (j, l, h), (l, i, n)}. The resulting random graph model is new, and is a special case of the multivariate Markov random graphs mentioned earlier. Here, we term the corresponding version of model (1) the path-dependent random multigraph model.
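Labelled paths, and the associativity of concatenation, can be illustrated with a short Python sketch. The function names, the 0-based relation indices and the randomly generated example array are assumptions made for illustration; the intermediate actor l is allowed to be any actor, as in the definition above.

```python
import numpy as np

def compose(P, Q):
    """Binary matrix linking i to j when P links i to some l and Q links l to j."""
    return ((P @ Q) > 0).astype(int)

def labelled_path(X, labels):
    """Pairs linked by a path whose label is the given sequence of relation types.

    X is a g x g x r binary array; labels holds 0-based relation indices.
    """
    P = X[:, :, labels[0]]
    for m in labels[1:]:
        P = compose(P, X[:, :, m])
    return P

# Hypothetical random network with g = 5 actors and r = 3 relations.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(5, 5, 3))
for m in range(3):
    np.fill_diagonal(X[:, :, m], 0)   # no self-ties

# Concatenating (mh) with n gives the same pairs as concatenating m with (hn).
left = compose(compose(X[:, :, 0], X[:, :, 1]), X[:, :, 2])
right = compose(X[:, :, 0], compose(X[:, :, 1], X[:, :, 2]))
```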

Actor effects. The fourth argument has arisen in the social cognition literature, and posits actor attributes or biases associated with either the actor from whom the tie is directed (leading to a so-called row effect) or the actor to whom the tie is directed (a so-called column effect). Row effects are associated with conditional dependencies of the form {(i, j, m), (i, k, h)}, and give rise to maximal cliques in D of the form

$$\{(i, 1, 1), (i, 1, 2), \ldots, (i, 1, r), (i, 2, 1), (i, 2, 2), \ldots, (i, 2, r), \ldots, (i, g, 1), (i, g, 2), \ldots, (i, g, r)\}.$$

Such dependencies are likely to be assumed when actor i is the source of information about all relational ties emanating from actor i; they are also necessarily imposed if constraints are placed on the total number of ties directed from actor i (see Holland & Leinhardt, 1973). Column effects are associated with conditional dependencies of the form {(i, j, m), (k, j, h)}, and so with maximal cliques

$$\{(1, j, 1), (1, j, 2), \ldots, (1, j, r), (2, j, 1), (2, j, 2), \ldots, (2, j, r), \ldots, (g, j, 1), (g, j, 2), \ldots, (g, j, r)\}.$$

Position effects and blockmodels. A fifth theme in the structural analysis of multiple networks is that distinctive patterns of inter-individual ties are associated with particular social positions. Thus, individuals occupying similar social positions may exhibit similar conditional dependencies among ties, whereas those occupying distinct positions may possess quite distinct inter-tie dependencies. Thus, knowledge of social position may be used as a basis for some hypothesized equations among the parameters referring to particular patterns of conditional dependencies (determined by cliques in the dependence graph); these issues are further discussed below.

Interdependence of interlocking roles. In addition, several of these arguments may be combined. For instance, if we assume conditional dependencies associated with arguments for multiplexity, reciprocity and exchange, as well as role-interlocking effects, then the class of Markov random multigraphs results; its maximal cliques have the form of either a multivariate triad,

$$\begin{aligned} \{ & (i, j, 1), (i, j, 2), \ldots, (i, j, r), (j, k, 1), (j, k, 2), \ldots, (j, k, r), \\ & (k, i, 1), (k, i, 2), \ldots, (k, i, r), (j, i, 1), (j, i, 2), \ldots, (j, i, r), \\ & (k, j, 1), (k, j, 2), \ldots, (k, j, r), (i, k, 1), (i, k, 2), \ldots, (i, k, r) \}, \end{aligned}$$

or a multivariate star,

$$\begin{aligned} \{ & (1, i, 1), (1, i, 2), \ldots, (1, i, r), (2, i, 1), (2, i, 2), \ldots, (2, i, r), \ldots, \\ & (g, i, 1), (g, i, 2), \ldots, (g, i, r), (i, 1, 1), (i, 1, 2), \ldots, (i, 1, r), \\ & (i, 2, 1), (i, 2, 2), \ldots, (i, 2, r), \ldots, (i, g, 1), (i, g, 2), \ldots, (i, g, r) \}. \end{aligned}$$

Note that these three assumptions also entail actor effects; hence, we claim that the class of Markov random multigraphs is a quite plausible framework for the modelling of structure in multiple social networks.

3.1.3. Homogeneity constraints

For many of the specific dependence graphs that we have discussed, particularly for Markov random multigraphs, model (1) may require the estimation of a large number of parameters. It is often useful, therefore, to introduce certain equality constraints among the parameters or to set certain parameters to zero. One can define a class of homogeneous models for multivariate networks in which networks that are isomorphic under relabellings of the nodes are equiprobable.

More generally, we introduce an assumption that parameters corresponding to certain isomorphic configurations of nodes are equal. We identify a random graph configuration with a subset $A$ of $N_D$, and we call configurations corresponding to $A$ and $B$ isomorphic if there is a one-to-one mapping $w$ on the nodes in $N$ such that $(i, j, m) \in A$ if and only if $(w(i), w(j), m) \in B$, for $i, j \in N$ and $m \in R$. If two configurations $A$ and $B$ are isomorphic, we set $\lambda_A = \lambda_B$ and note that the sufficient statistic corresponding to $\lambda_A$ becomes

$$\sum_{B} \prod_{(i,j,m) \in B} x_{ijm},$$

where the summation is over all configurations $B$ isomorphic to $A$.

A more restricted form of parameter equating may also be useful when the nodes of the random graph are hypothesized to fall into distinct classes or positions (as in an a priori blockmodel; see, for example, White et al., 1976; Wasserman & Anderson, 1987; Wasserman & Faust, 1994, Chapter 10). In this case, the random graph nodes of the configuration identified with the subset $A$ may be regarded as coloured, and two configurations $A$ and $B$ are defined as isomorphic if there is a one-to-one mapping $w$ on the nodes of $N$ such that:

1. $(i, j, m) \in A$ if and only if $(w(i), w(j), m) \in B$;
2. $i$ and $w(i)$ have the same colour;
3. $j$ and $w(j)$ have the same colour.

We then set $\lambda_A = \lambda_B$ only if $A$ and $B$ are isomorphic (using this more restrictive definition).

3.2. The multivariate p* model

As mentioned, we refer to equation (1) as the multivariate p* model. The parameters of the model correspond to the cliques of the dependence graph $D$. The sufficient statistic corresponding to the parameter $\lambda_A$ for clique $A$ of $D$ has the form $\prod_{(i,j,m) \in A} x_{ijm}$; in the case where homogeneity effects have been imposed, the sufficient statistics are counts of such values over cliques whose parameters are set to be equal.

3.2.1. Introduction

The dependence structures for social networks described in the preceding section give rise to network statistics identified in Table 1. In order more easily to define the statistics, we introduce a counting function $f$ for an array $Z$ as the sum of entries in the array: $f_Z = \sum_{i,j} Z_{ij}$. The function $f$ is a count of the number of distinct ordered pairs of nodes $i$ and $j$ for which there is a relational tie of type $Z$. For convenience, we refer to the parameter corresponding to $f_Z$ as $\theta_Z$.

Table 1. Some statistics and parameters for univariate and multivariate relations

Effect                      Parameter         Graph statistic in 'count form'

Single dichotomous relations
  Choice                    $\theta$          $f_{X}$
  Mutuality                 $\rho$            $f_{X \cap X'}$
  Transitivity              $\tau$            $f_{XX \cap X}$
  Expansiveness             $\alpha_i$        $f_{X \cap R_i}$
  Attractiveness            $\beta_j$         $f_{X \cap C_j}$
  m-paths                   $\pi_m$           $f_{X^m}$
  Subgroup density          $\phi_{[st]}$     $f_{X \cap d_{st}}$
  Subgroup mutuality        $\rho_{[st]}$     $f_{X \cap X' \cap d_{st}}$
  Subgroup transitivity     $\tau_{[sut]}$    $f_{(X \cap d_{su})(X \cap d_{ut}) \cap (X \cap d_{st})}$

Multivariate relations
  Association               $C$               $f_{X \cap Y}$
  Multiplexity              $\theta_{kl}$     $f_{X_k \cap X_l}$
  Exchange                  $\rho_{kl}$       $f_{X_k \cap X_l'}$
  Generalized transitivity  $\tau_{klm}$      $f_{(X_k X_l) \cap X_m}$

When homogeneity constraints are imposed, we can represent the sufficient statistics in a compact form. For the assumption of multiplex conditional dependencies, any clique in $N_D$ has the form

$$A = \{(i, j, m_1), (i, j, m_2), \ldots, (i, j, m_q)\};$$

thus, in the homogeneous case, the sufficient statistic for the multiplex parameter associated with the clique $A$ is $f_Z$, where $Z = X_{m_1} \cap X_{m_2} \cap \cdots \cap X_{m_q}$. (Note that any non-empty subset of relations gives rise to a clique of this form, so that we also have statistics of the form $f_{X_m}$, $f_{X_k \cap X_l}$, and so forth.)

Reciprocity cliques of the form $\{(i, j, m), (j, i, l)\}$ give rise to the exchange statistics $f_{X_m \cap X_l'}$.

Cliques in role-interlocking dependence structures lead to additional 2-path and 3-cycle statistics of the form $f_{X_m X_h}$ and $f_{(X_m X_h) \cap X_n'}$, respectively.

Some of the statistics for parameters reflecting row and column effects can be defined using the indicator matrices $R_i$ and $C_j$, whose elements are given by

$$(R_i)_{kl} = \begin{cases} 1 & \text{if } k = i, \\ 0 & \text{otherwise,} \end{cases} \qquad (C_j)_{kl} = \begin{cases} 1 & \text{if } l = j, \\ 0 & \text{otherwise.} \end{cases}$$
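The count-form statistics above translate directly into array operations. The following is a minimal sketch, assuming binary adjacency matrices with zero diagonals stored as NumPy arrays; the toy matrices `X1` and `X2` and the helper names are hypothetical and are not taken from the paper. Intersection is taken as the elementwise product, the converse as the transpose, and composition as the matrix product, so that compound statistics count paths rather than merely indicating their presence. Note that the 2-path count keeps the diagonal of the product; whether paths returning to the sender should be counted is a choice the text does not settle.

```python
import numpy as np

def f(Z):
    """Counting function f_Z: the sum of the entries of the array Z."""
    return int(np.sum(Z))

def row_indicator(i, g):
    """R_i: ones in row i, zeros elsewhere."""
    R = np.zeros((g, g), dtype=int)
    R[i, :] = 1
    return R

# Hypothetical toy data: two binary relations on g = 4 actors.
X1 = np.array([[0, 1, 1, 0],
               [1, 0, 0, 0],
               [0, 1, 0, 1],
               [0, 0, 0, 0]])
X2 = np.array([[0, 1, 0, 0],
               [0, 0, 1, 0],
               [1, 0, 0, 0],
               [0, 0, 1, 0]])

choice_1 = f(X1)                      # f_{X_1}
multiplexity = f(X1 * X2)             # f_{X_1 ∩ X_2}
exchange = f(X1 * X2.T)               # f_{X_1 ∩ X_2'}
two_paths = f(X1 @ X2)                # f_{X_1 X_2}, diagonal included
three_cycles = f((X1 @ X2) * X1.T)    # f_{(X_1 X_2) ∩ X_1'}
expansiveness_0 = f(X1 * row_indicator(0, 4))  # out-degree of actor 0 in X_1
```

The column indicator $C_j$ can be built in the same way by setting a column rather than a row to one.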

In order to define statistics for the Markov random multigraph model, let $R_k$ be any subset of relations and define $Y_k$ as the intersection of the relations in $R_k$. The triad statistic corresponding to a general multivariate triad has the general form $f_Z$, with $Z = (Y_1 \cap Y_4')(Y_2 \cap Y_5') \cap (Y_3 \cap Y_6')$, for some $Y_1, Y_2, \ldots, Y_6$.

When homogeneity is imposed only within $S$ possible blocks or positions, the network statistics that arise correspond to within-block sums and can be represented by using the matrix $d_{st}$ with entries

$$(d_{st})_{ij} = \begin{cases} 1 & \text{if } i \in \text{block } s \text{ and } j \in \text{block } t, \\ 0 & \text{otherwise.} \end{cases}$$

For example, in the case of any homogeneous statistic $f_Z$, the block-homogeneous set of statistics is $\{f_{Z \cap d_{st}} : s = 1, 2, \ldots, S;\ t = 1, 2, \ldots, S\}$.

Some other network statistics and associated parameters are also presented in Table 1. This table also identifies the parameter labels used in Wasserman & Pattison (1996) and their generalizations to multivariate networks.
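As an illustration of the block-homogeneous statistics, the sketch below builds the indicator matrix $d_{st}$ from a vector of block labels and returns the within-block and between-block counts of a relation. The data, the block assignment and the function names are hypothetical, chosen only to make the example self-contained.

```python
import numpy as np

def block_indicator(blocks, s, t):
    """d_st: entry (i, j) is 1 if actor i is in block s and actor j is in block t."""
    blocks = np.asarray(blocks)
    return np.outer(blocks == s, blocks == t).astype(int)

def block_counts(Z, blocks):
    """Return {(s, t): f_{Z ∩ d_st}} for every ordered pair of blocks."""
    labels = sorted(set(blocks))
    return {(s, t): int(np.sum(Z * block_indicator(blocks, s, t)))
            for s in labels for t in labels}

# Hypothetical example: 5 actors assigned to blocks 0 and 1.
X = np.array([[0, 1, 0, 0, 1],
              [1, 0, 1, 0, 0],
              [0, 0, 0, 1, 0],
              [0, 0, 1, 0, 1],
              [1, 0, 0, 0, 0]])
blocks = [0, 0, 1, 1, 1]
counts = block_counts(X, blocks)
```

Because the blocks partition the actors, the block counts sum to the homogeneous statistic $f_Z$.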

Note that each of the statistics described above may be assumed to be homogeneous, or may be allowed to depend on some mutually exclusive and exhaustive partition of actors or pairs of actors. For example, generalized transitivity statistics may be calculated for every triple of subgroups arising from a partition (for example, $f_{(X_k \cap d_{su})(X_l \cap d_{ut}) \cap (X_m \cap d_{st})}$) and may be used to assess the homogeneity of generalized transitivity across subgroups.


3.2.2. The model

In combination with various homogeneity constraints, model (1) can be written in the general form

$$P(X = x) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)}, \qquad (2)$$

where $\theta$ is a vector of model parameters and $z(x)$ is a vector of network statistics. As we have described, these vectors depend on the structure of the hypothesized dependence graph and on whether any homogeneity constraints have been proposed.

The model is of exponential family form; that is, the probability function depends on an exponential function of a linear combination of network statistics. In some cases, constraints on the elements of $\theta$ are required in order to ensure a set of uniquely determined parameters (as we illustrate later with our examples). Usually, the elements of $\theta$ are unknown and must be estimated.

The function $\kappa(\theta)$ in the denominator of model (2) is a normalizing quantity whose value guarantees that the probability distribution is indeed proper, summing to unity over the sample space of the random variable $X$ (the set of all possible multivariate networks with $r$ relations and $g$ actors).

Estimation of the parameters of models that assume only multiplexity and/or generalized reciprocity and exchange effects (as in the multivariate $p_1$ model) is not particularly difficult. In these cases, the likelihood function is simply the product of the probabilities for each multivariate tie or dyad (for example, see Wasserman, 1987). Estimation of parameters of the general multivariate p* model is not straightforward, however. The likelihood function for the parameters $\theta$ of p* depends on the complicated normalizing quantity $\kappa(\theta)$, which makes maximum likelihood estimation difficult except in special circumstances (such as dyadic independence) and when the multigraphs are quite small (Walker, 1995). In order for probabilities to be computed, one must be able to calculate $\kappa$, which is computationally infeasible for most networks. Hence, alternative model formulations and approximate estimation techniques are important. One such alternative, which we now describe, utilizes log-odds ratios of the conditional probabilities of each element of $X$.
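To make the difficulty concrete, the sketch below computes $\kappa(\theta)$ by brute-force enumeration for a very small multigraph. It assumes directed relations without self-ties, so that the sample space has $2^{rg(g-1)}$ elements; the statistic vector (two choice counts and one multiplexity count) and the parameter values are hypothetical. For this particular model the ties for different ordered pairs are independent, so a closed form is available as a check; no such shortcut exists for the general model.

```python
import itertools
import numpy as np

def stats(x):
    """Hypothetical z(x) for r = 2 relations: two choice counts, one multiplexity count."""
    x1, x2 = x
    return np.array([x1.sum(), x2.sum(), (x1 * x2).sum()], dtype=float)

def kappa(theta, g, r=2):
    """Normalizing constant by enumerating all 2**(r*g*(g-1)) multigraphs."""
    cells = [(m, i, j) for m in range(r)
             for i in range(g) for j in range(g) if i != j]
    total = 0.0
    for bits in itertools.product((0, 1), repeat=len(cells)):
        x = np.zeros((r, g, g), dtype=int)
        for (m, i, j), b in zip(cells, bits):
            x[m, i, j] = b
        total += np.exp(theta @ stats(x))
    return total

theta = np.array([-1.0, -0.5, 0.8])
k3 = kappa(theta, g=3)  # 2**12 = 4096 terms

# Closed form for this dyad-cell-independent model: one factor per ordered pair.
per_pair = 1 + np.exp(theta[0]) + np.exp(theta[1]) + np.exp(theta.sum())
k3_closed = per_pair ** (3 * 2)
```

With $g = 3$ the sum has 4096 terms; with two relations on $g = 10$ actors it would have $2^{180}$, which is why enumeration is confined to very small multigraphs.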

3.2.3. The logit model

We can turn model (2) into a generalized autologistic model for conditional probabilities, giving us an equivalence between model (2) and spatial models (Besag, 1972, 1974; Strauss, 1992). The step utilizes the dichotomous nature of the random variable $X_{ijm}$ and produces an approximate likelihood function that is much easier to deal with.

We first condition on the complement of $X_{ijm}$ and consider just the probability that the dichotomous random variable $X_{ijm}$ is unity. Recall that this variable records whether the tie from $i$ to $j$ of type $m$ is present. Writing $x^+_{ijm}$ for the realization $x$ with $x_{ijm}$ set to 1 and $x^-_{ijm}$ for the realization with $x_{ijm}$ set to 0, consider

$$P(X_{ijm} = 1 \mid X^c_{ijm}) = \frac{P(X = x^+_{ijm})}{P(X = x^+_{ijm}) + P(X = x^-_{ijm})} = \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^+_{ijm})\} + \exp\{\theta' z(x^-_{ijm})\}}, \qquad (3)$$

which has the advantage of not depending on the normalizing quantity. We next consider the odds ratio, which simplifies model (3):

$$\frac{P(X_{ijm} = 1 \mid X^c_{ijm})}{P(X_{ijm} = 0 \mid X^c_{ijm})} = \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^-_{ijm})\}} = \exp\{\theta' [z(x^+_{ijm}) - z(x^-_{ijm})]\}. \qquad (4)$$

From this, the log-odds ratio, or logit model, has the rather simple expression

$$\omega_{ijm} = \log\left\{\frac{P(X_{ijm} = 1 \mid X^c_{ijm})}{P(X_{ijm} = 0 \mid X^c_{ijm})}\right\} = \theta' [z(x^+_{ijm}) - z(x^-_{ijm})]. \qquad (5)$$

If we define $d(x_{ijm}) = [z(x^+_{ijm}) - z(x^-_{ijm})]$, then the logit model (5) simplifies succinctly to $\omega_{ijm} = \theta' d(x_{ijm})$. The expression $d(x_{ijm})$ is the vector of changes in the network statistics that arises when the quantity $x_{ijm}$ changes from 1 to 0. This version of the model is a logit p* model for a multivariate network and is a generalized autologistic model (see Strauss, 1992) applied to social network data.

3.3. Estimation

The likelihood function for the general form of the multivariate p* model (2) is

$$L(\theta) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)},$$

where the dependence on the normalizing quantity can easily be seen. As mentioned, maximum likelihood estimation of $\theta$ is difficult due to the size of the sample space.

An approximate estimation approach, proposed by Besag (1975, 1977b) and adopted by Strauss (1986), Strauss & Ikeda (1990) and Wasserman & Pattison (1996), utilizes tools made popular in models for rectangular lattices and spatial data; specifically, we use the logit formulation and define the pseudo-likelihood function as

$$PL(\theta) = \prod_{i \neq j} \prod_{m=1}^{r} P(X_{ijm} = 1 \mid X^c_{ijm})^{x_{ijm}} \, P(X_{ijm} = 0 \mid X^c_{ijm})^{1 - x_{ijm}}, \qquad (6)$$

and a maximum pseudo-likelihood estimator (MPLE) to be the value of $\theta$ that maximizes (6). MPLEs are much easier to calculate than maximum likelihood estimators (MLEs). MPLEs differ from MLEs for all but the simplest models (those for which the conditional probabilities are indeed independent of the complement relation). Basically, the approach assumes conditional independence of the random variables representing the multivariate relational ties (for discussion of the issues in using maximum pseudo-likelihood rather than maximum likelihood estimation, see Wasserman & Pattison, 1996, and Preisler, 1993).
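It may help to write the pseudo-likelihood on the log scale in terms of the logits. The short derivation below is a sketch that uses only the conditional probabilities implied by the logit model, namely $P(X_{ijm} = 1 \mid X^c_{ijm}) = e^{\omega_{ijm}} / (1 + e^{\omega_{ijm}})$ with $\omega_{ijm} = \theta' d(x_{ijm})$; the symbol $p_{ijm}$ is shorthand introduced here for that probability.

$$\log PL(\theta) = \sum_{i \neq j} \sum_{m=1}^{r} \left[ x_{ijm}\, \omega_{ijm} - \log\left(1 + e^{\omega_{ijm}}\right) \right],$$

$$\frac{\partial \log PL(\theta)}{\partial \theta} = \sum_{i \neq j} \sum_{m=1}^{r} \left( x_{ijm} - p_{ijm} \right) d(x_{ijm}), \qquad p_{ijm} = \frac{e^{\omega_{ijm}}}{1 + e^{\omega_{ijm}}}.$$

The first expression has exactly the form of a logistic regression log-likelihood with responses $x_{ijm}$ and explanatory variables $d(x_{ijm})$, and setting the second to zero gives the usual logistic regression estimating equations.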

There is a large literature on the use of approximate likelihoods in spatial modelling. Diggle (1996) reviews models for discrete spatial variation and notes that there are several possible estimation techniques. He notes in his detailed discussion that MPLEs are more efficient than other possibilities (which include the coding method of Besag, 1974). Further, for moderately large samples, the differences between MPLEs and MLEs are often negligible. Small sample sizes, and hence small networks ($g < 10$), unfortunately are particularly problematic.


In social network modelling, Strauss & Ikeda (1990) established that estimation of $\theta$ for single dichotomous relations can be accomplished via logistic regression, using any standard logistic regression model-fitting routine. In particular, they showed that maximizing the pseudo-likelihood given in equation (6) is equivalent to maximizing the likelihood function for the fit of logistic regression to model (5) (for independent observations $x_{ijm}$). Further, they observed that such logistic regressions can be fitted using iteratively reweighted Gauss–Newton computational techniques, as implemented by any logistic regression model package.

The proof of this result uses the fact that the derivatives of the pseudo-likelihood, set equal to zero, are identical to those obtained from a logistic regression with the relational variables as data values. Thus, fitting p* can be done by using the logit p* form and assuming that the relational variables are actually statistically independent. The idea for this theorem was first suggested by Frank & Strauss (1986) for estimation of the parameters in their triad model. The generalization of this result to the three-way binary array $X$ is straightforward.
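The equivalence can be illustrated with a small self-contained fit. The sketch below is not the authors' code: it simulates a hypothetical two-relation network, builds the explanatory variables for choice, reciprocity and exchange effects in closed form, and maximizes the pseudo-likelihood by Newton-Raphson iterations (a hand-written stand-in for the iteratively reweighted routine of a statistical package). The random data, the seed and all function names are assumptions made for the illustration.

```python
import numpy as np

def fit_logistic(D, x, n_iter=50, tol=1e-10):
    """Newton-Raphson fit of a no-intercept logistic regression of the
    0/1 responses x on the columns of D."""
    theta = np.zeros(D.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(D @ theta)))
        grad = D.T @ (x - p)                        # pseudo-score
        hess = D.T @ (D * (p * (1 - p))[:, None])   # information matrix
        step = np.linalg.solve(hess, grad)
        theta += step
        if np.max(np.abs(step)) < tol:
            break
    return theta

# Hypothetical data: two random binary relations on 12 actors, no self-ties.
rng = np.random.default_rng(0)
g = 12
x = (rng.random((2, g, g)) < 0.3).astype(int)
for m in range(2):
    np.fill_diagonal(x[m], 0)

# One row of change statistics per tie variable (i, j, m).
rows, resp = [], []
for m in range(2):
    other = 1 - m
    for i in range(g):
        for j in range(g):
            if i == j:
                continue
            d = np.zeros(5)
            d[m] = 1                       # choice for relation m
            d[2 + m] = 2 * x[m, j, i]      # reciprocity, count form f_{X ∩ X'}
            d[4] = x[other, j, i]          # exchange with the other relation
            rows.append(d)
            resp.append(x[m, i, j])
D = np.array(rows)
x_obs = np.array(resp, dtype=float)

theta_hat = fit_logistic(D, x_obs)
```

At convergence the pseudo-score `D.T @ (x_obs - p)` is numerically zero, which is the estimating equation referred to in the proof. Any logistic regression routine fitted without an intercept to the same `D` and `x_obs` should give the same estimates.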

The evaluation of the fit of multivariate p* is not straightforward, but it is helpful to compare the observed values $x_{ijm}$ with the fitted values $\hat{x}_{ijm}$. The fitted values, as is common with dichotomous variables, are defined as $\hat{x}_{ijm} = \hat{P}(X_{ijm} = 1 \mid X^c_{ijm})$. The estimated conditional probabilities are computed from

$$\operatorname{logit} \hat{P}(X_{ijm} = 1 \mid X^c_{ijm}) = \hat{\theta}' d(x_{ijm}).$$

Two useful indices of fit are the pseudo-likelihood ratio statistic

$$G^2_{PL} = 2 \sum x_{ijm} \log(x_{ijm} / \hat{x}_{ijm})$$

for a model, and the mean of the absolute value of the residuals $(x_{ijm} - \hat{x}_{ijm})$. In the examples below we report both $G^2_{PL}$ and the mean absolute residual. Unfortunately, as with all other uses of this MPLE approach, the distribution of $G^2_{PL}$ is unknown, even asymptotically, and there is no straightforward way of estimating the standard errors of parameter estimates (although asymptotic standard errors calculated from logistic regression models can give approximate guidance to the modeller). Crouch & Wasserman (1998) give some preliminary results comparing MPLEs to MLEs and report the optimistic finding that, for moderately large networks ($g > 10$), both standard errors and test statistics based on the pseudo-likelihood approach are quite close to those based on the exact likelihood.
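Both indices are simple functions of the observed ties and the fitted conditional probabilities. The sketch below assumes that $G^2_{PL}$ is evaluated as the usual logistic regression deviance, that is, with the sum taken over both the tie-present and tie-absent outcome of each observation and $0 \log 0$ read as 0, which makes it equal to $-2 \log PL(\hat{\theta})$. That reading is an assumption about the printed formula, and the function name and example values are hypothetical.

```python
import numpy as np

def fit_indices(x_obs, p_hat):
    """Return (-2 log pseudo-likelihood, mean absolute residual)."""
    x_obs = np.asarray(x_obs, dtype=float)
    # Clip to avoid log(0) when a fitted probability is numerically 0 or 1.
    p_hat = np.clip(np.asarray(p_hat, dtype=float), 1e-12, 1 - 1e-12)
    log_pl = np.sum(x_obs * np.log(p_hat) + (1 - x_obs) * np.log(1 - p_hat))
    return -2.0 * log_pl, float(np.mean(np.abs(x_obs - p_hat)))

# Four hypothetical tie variables with their fitted conditional probabilities.
dev, mar = fit_indices([1, 0, 0, 1], [0.8, 0.3, 0.1, 0.6])
```

As a rough consistency check on this reading, a constant fitted probability of about 0.24 over the $2 \times 29 \times 28 = 1624$ tie variables of the first example gives a deviance of roughly 1797 and a mean absolute residual of roughly 0.37, in line with the values reported for the one-parameter independence model.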

3.4. Computational details

Maximum pseudo-likelihood estimates of the parameters of model (1) are obtained by fitting the logistic regression model (5). In order to fit model (5), we compute, for each relational tie, the values of the 'explanatory variables' $z(x^+_{ijm}) - z(x^-_{ijm})$ corresponding to each statistic $z(x)$; we then use these as the observed explanatory variables for the realization of $X_{ijm}$ (the 'response variable') in the logistic regression corresponding to model (5).

The computation of the values $z(x^+_{ijm}) - z(x^-_{ijm})$ is simple, but it is useful to note that the values may take a different form for the various types of relational ties (corresponding to the subscript $m$ of $X_{ijm}$). For example, suppose that there are two relations, $X_l$ and $X_h$, and consider the parameter corresponding to the triadic effect $Z = (X_l X_h) \cap X_h$. If we assume homogeneity, then the sufficient statistic for this parameter is $f_Z$. For the two relations, the computed values of the explanatory variable for this triadic effect are equal to the changes in the statistic $f_Z$ when $x_{ijm}$ changes from 1 to 0, for $m = l$ or $h$. Thus, when $m = l$ (corresponding to the values for the first relation, $X_l$), we compute $\sum_k x_{ikh} x_{jkh}$ as the value of the explanatory variable corresponding to this parameter, and when $m = h$ (corresponding to an $X_h$ tie), we compute $\sum_k x_{ikl} x_{kjh} + \sum_k x_{kil} x_{kjh}$.
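The two closed-form expressions can be checked numerically against the brute-force change in $f_Z$. The sketch below does this for every tie variable of a small random pair of relations; it assumes zero diagonals and the count form $f_Z = \sum_{i,j,k} x_{ikl}\, x_{kjh}\, x_{ijh}$, and the random data and function names are hypothetical.

```python
import numpy as np

def f_triadic(xl, xh):
    """f_Z for Z = (X_l X_h) ∩ X_h, in count form."""
    return int(np.sum((xl @ xh) * xh))

def change_by_toggle(xl, xh, i, j, m):
    """Brute-force change in f_Z when tie (i, j) of relation m goes from 1 to 0."""
    hi = [xl.copy(), xh.copy()]
    lo = [xl.copy(), xh.copy()]
    k = 0 if m == "l" else 1
    hi[k][i, j], lo[k][i, j] = 1, 0
    return f_triadic(*hi) - f_triadic(*lo)

def change_closed_form(xl, xh, i, j, m):
    """Closed-form explanatory variable for the triadic effect."""
    if m == "l":
        return int(xh[i] @ xh[j])                       # sum_k x_ikh x_jkh
    # sum_k x_ikl x_kjh + sum_k x_kil x_kjh
    return int(xl[i] @ xh[:, j] + xl[:, i] @ xh[:, j])

rng = np.random.default_rng(1)
g = 6
xl = (rng.random((g, g)) < 0.4).astype(int)
xh = (rng.random((g, g)) < 0.4).astype(int)
np.fill_diagonal(xl, 0)
np.fill_diagonal(xh, 0)

agree = all(change_by_toggle(xl, xh, i, j, m) == change_closed_form(xl, xh, i, j, m)
            for m in ("l", "h") for i in range(g) for j in range(g) if i != j)
```

The closed forms avoid recomputing the full statistic for each tie variable, which matters when the design matrix for the logistic regression has $r g (g - 1)$ rows.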

4. Examples

We illustrate the construction and fitting of multivariate p* models using two examples.

4.1. The Grade 7 peer network

The first example is an extension of the data analysed by Wasserman & Pattison (1996). Vickers (1981) and Vickers & Chan (1981) obtained network data from 29 students in grade 7 in a school in Victoria, Australia. They asked students to nominate their classmates on a number of relations, including the following:

1) Who are your best friends in the class?
2) Who would you rather not have as a friend?

We label the relations defined by these two questions as $X_B$ (relation 1) and $X_N$ (relation 2), and their associated matrices as $B$ and $N$, respectively. The matrix for the 'best friends' relation is given here as our Table 2 and the matrix for the 'not friends' relation as our Table 3. As noted by Wasserman & Pattison (1996), actors 1–12 are boys, while actors 13–29 are girls.

In Wasserman & Pattison (1996), we analysed the relation $X_B$ and established that it possessed strong reciprocity and transitivity effects. Here, we fit models simultaneously to the relations $X_B$ and $X_N$ in an attempt to model their mutual interdependence. Our models use the methodology described earlier and are guided by the literature that has speculated on the structure of positive and negative affect ties (see the discussion in Wasserman & Faust, 1994, Chapter 6, on signed graphs); we also compare our models to previous descriptive analyses of similar types of ties. We report the fit of a number of homogeneous models.

Models 1a and 1b – independence. We first fit two versions of a complete independence model, in which we make the (implausible) assumption that all observed ties are independent. In the first version of the model, we allow a single separate 'choice' parameter $\theta_Z$ (where $Z$ may be either $B$ or $N$) for each type of relation; in the second, more restricted version, we assume a single common choice parameter. In both versions of the model, the maximal cliques of the dependence graph have the form $\{(i, j, m)\}$; in model 1a, the parameters corresponding to this clique are assumed to depend on relation $m$ (but not on actor $i$ or $j$), whereas in model 1b, the parameter is assumed constant for all $i$, $j$ and $m$. The sufficient statistics for model 1a are $f_B$ and $f_N$; model 1b has sufficient statistic $f_{B+N}$. The fit of models 1a and 1b is summarized in Table 4. Neither model provides a good fit, with the mean of the absolute residuals equal to approximately 0.37. Since model 1b is nested in model 1a, the difference between the pseudo-likelihood ratio statistics is of interest, and we note that model 1b appears to be no worse a fit than model 1a ($\Delta G^2_{PL} = 3.2$, and the models differ by one parameter).

Model 2 – multiplexity. Model 2 is a multiplexity model with maximal cliques $\{(i, j, 1), (i, j, 2)\}$. The model allows for the possibility that an $X_B$ tie from $i$ to $j$ is conditionally dependent on an $X_N$ tie from $i$ to $j$. The parameters of the model have the form $\theta_Z$, where $Z$ may be $B$, $N$ or $B \cap N$; the corresponding sufficient statistics are $f_B$, $f_N$ and $f_{B \cap N}$, respectively. Thus, this model adds a single multiplex parameter $\theta_{B \cap N}$ to the two choice parameters in model 1a. Model 2 appears to be a substantial improvement over model 1a ($\Delta G^2_{PL} = 253.7$ with one additional parameter), but the small frequency of $B \cap N$ ties implies that the MPLE of its corresponding parameter is likely to have a large standard error.

Models 3a and 3b – reciprocity and exchange. Model 3 assumes bivariate dyad independence (as described by Wasserman, 1987) and has maximal cliques $\{(i, j, 1), (i, j, 2), (j, i, 1), (j, i, 2)\}$. We fit two restricted versions of the model: first, model 3a, in which only choice and reciprocity effects are assumed (with parameters $\theta_Z$ for $Z = B$, $N$, $B \cap B'$ and $N \cap N'$); and second, model 3b, with an additional exchange parameter $\theta_Z$ for the relation $Z = B \cap N'$. In model 3a, the presence of an $X_B$ tie from $i$ to $j$ is assumed to be conditionally dependent on the presence of an $X_B$ tie from $j$ to $i$ (that is, on the presence of reciprocity); similarly for $X_N$ ties. Model 3b allows, in addition, the presence of an $X_B$ tie from $i$ to $j$ to be conditionally dependent on the presence of an $X_N$ tie from $j$ to $i$ (that is, on the exchange of an $X_N$ tie for an $X_B$ one). We have not fitted the most general homogeneous dyad-independence model, which includes multiplexity parameters, since $B$ and $N$ co-occur only rarely (and, as a result, it is difficult to fit parameters corresponding to relations such as $B \cap N$, $B \cap N \cap B'$, and so forth). The fit statistics in Table 4 indicate that not only is model 3a a substantial improvement over model 1a ($\Delta G^2_{PL} = 208.6$ with just two additional parameters), but also that model 3b provides a marginally better fit than model 3a ($\Delta G^2_{PL} = 27.1$ with one additional parameter). These figures suggest the presence of both reciprocity and exchange effects. Note, though, that the fit of model 3b is still not particularly good, with the mean of the absolute residuals equal to 0.311.

Table 2. Vickers & Chan's (1981) network data: 'best friends' relation

0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 1 0 1 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 1 0 1 0 1 1 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 0 1 0
1 1 1 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 0 0 1 1 1
1 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 0 0 1 1 1 1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 1 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 1 0 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 1 0 0 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0

Table 3. Vickers & Chan's (1981) network data: 'not friends' relation

0 0 0 0 0 0 1 0 1 1 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 0 0 1 1
0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 1 1 0 0 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0
1 0 1 1 0 0 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 1
0 0 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
0 0 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0
1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 1
1 0 0 1 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0

Table 4. Summary of fit of models 1a–5b to the grade 7 peer network

Model   No. of parameters   $G^2_{PL}$   Mean absolute residual
1a      2                   1794.1       0.366
1b      1                   1797.3       0.367
2       3                   1540.4       0.329
3a      4                   1585.5       0.315
3b      5                   1558.4       0.311
4       13                  1511.0       0.300
5a      19                  1220.6       0.241
5b      23                  1032.3       0.196
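The model comparisons quoted for this example are differences between rows of Table 4. The short sketch below recomputes them; the dictionary is transcribed from Table 4, reading $G^2_{PL}$ with one decimal place, and the helper name `compare` is hypothetical. Since the distribution of $G^2_{PL}$ is unknown, the differences are descriptive only and are not referred to a chi-squared distribution here.

```python
# (number of parameters, G2_PL) for each model, transcribed from Table 4
table4 = {"1a": (2, 1794.1), "1b": (1, 1797.3), "2": (3, 1540.4),
          "3a": (4, 1585.5), "3b": (5, 1558.4), "4": (13, 1511.0),
          "5a": (19, 1220.6), "5b": (23, 1032.3)}

def compare(restricted, general):
    """Drop in G2_PL and number of extra parameters, restricted -> general."""
    (p0, g0), (p1, g1) = table4[restricted], table4[general]
    return round(g0 - g1, 1), p1 - p0

nested_pairs = [("1b", "1a"), ("1a", "2"), ("1a", "3a"), ("3a", "3b"),
                ("3b", "4"), ("4", "5a"), ("5a", "5b")]
for restricted, general in nested_pairs:
    drop, extra = compare(restricted, general)
    print(f"{restricted} -> {general}: delta G2_PL = {drop}, extra parameters = {extra}")
```

The recomputed differences (3.2, 253.7, 208.6, 27.1, 47.4, 290.4 and 188.3) agree with the values quoted in the text for this example.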

Model 4 – path dependence. Model 4 is a path-dependent model and assumes that a tie of any type from $i$ to $j$ may be conditionally dependent on ties of any type from $j$ to some third individual $k$. Maximal cliques therefore have the form $\{(i, j, m), (j, i, h)\}$ or $\{(i, j, m), (j, k, h), (k, i, p)\}$; parameters and sufficient statistics are given by $\theta_Z$ and $f_Z$, respectively, where $Z$ may be any of the relations $B$, $N$, $B \cap B'$, $N \cap N'$, $B \cap N'$, $BB$, $BN$, $NB$, $NN$, $BB \cap B'$, $BN \cap B'$, $BN \cap N'$ and $NN \cap N'$. Compared to model 3b, model 4 adds only marginally to the fit ($\Delta G^2_{PL} = 47.4$ with eight additional parameters).

Models 5a and 5b – restricted Markov random graph models. The final set of models are path-dependent models with additional dependencies assumed on substantive grounds. All models have the model 4 parameters; in addition, model 5a possesses dependencies consistent with the transitivity-like hypothesis that friends are likely to agree on their relations with third parties (hence likely pairwise conditional dependencies between relational ties of type $X_B$ from $i$ to $j$, $i$ to $k$ and $j$ to $k$, and also between relational ties of type $X_B$ from $i$ to $j$, of type $X_N$ from $i$ to $k$ and of type $X_N$ from $j$ to $k$). Model 5b possesses additional dependencies consistent with the claim that non-friends are likely to disagree on their relations with third parties (hence likely pairwise conditional dependencies between relational ties of type $X_N$ from $i$ to $j$, of type $X_N$ from $i$ to $k$ and of type $X_B$ from $j$ to $k$, and also between relational ties of type $X_N$ from $i$ to $j$, of type $X_B$ from $i$ to $k$ and of type $X_N$ from $j$ to $k$). (See Johnsen (1986) for a review and analysis of the literature on the structure of affective ties and Pattison (1993) for an algebraic translation of these structural claims.) Model 5a adds $\{(i, j, 1), (j, k, 1), (i, k, 1)\}$ and $\{(i, j, 1), (j, k, 2), (i, k, 2)\}$ to the set of maximal cliques for model 4; model 5b also adds $\{(i, j, 2), (j, k, 1), (i, k, 2)\}$ and $\{(i, j, 2), (j, k, 2), (i, k, 1)\}$. We note that all of the subcliques of these additional maximal cliques have corresponding parameters in models 5a and 5b; these additional subcliques correspond to various forms of stars: $\{(i, j, m), (i, k, h)\}$, $\{(i, j, m), (k, j, h)\}$ and $\{(i, j, m), (j, k, h)\}$. As indicated in Table 4, the additional dependencies assumed by model 5a lead to a substantial improvement over the simple path-dependent model 4 ($\Delta G^2_{PL} = 290.4$ with six additional parameters), and those associated with model 5b lead to a modest further improvement in fit ($\Delta G^2_{PL} = 188.3$ with four additional parameters). The mean of the absolute residuals for model 5b is 0.196, suggesting a more reasonable fit to the data (but one that could lend itself to further possible improvement).

The MPLEs for the parameters of model 5b are displayed in Table 5. Positive estimates were observed for both reciprocity parameters and for the parameters associated with three of the four additional hypothesized dependencies. Thus, the conditional odds of a tie of any type appear to be enhanced if a reciprocal tie of the same type is present, if the tie completes one of the expected triadic structures for agreement between friends, or if the tie completes a triad in which an individual would rather not have as a friend any friend of someone who has been indicated as a non-friend. Negative estimates were obtained for the exchange parameter, for 2-stars comprising two incoming $X_B$ ties, and for 3-cycles comprising $X_B$ ties. Thus, the conditional odds of a tie of any type appear to be reduced by the presence of a reciprocated tie of the other type; in addition, the odds of an $X_B$ tie being directed to a particular individual are reduced if other $X_B$ ties are also directed to the same individual, or if the tie completes a 3-cycle of $X_B$ ties.

Table 5. Parameter estimates for model 5b fitted to the grade 7 peer network

Model parameter              $Z$            $\hat{\theta}_Z$   Approximate standard error

1-paths (choice)             $B$            -1.81              0.76
                             $N$            -2.39              0.65

2-cycles (reciprocity        $B \cap B'$     2.53              0.37
& exchange)                  $N \cap N'$     0.61              0.26
                             $B \cap N'$    -0.67              0.28

2-paths                      $BB$            0.01              0.05
                             $BN$           -0.03              0.04
                             $NB$           -0.11              0.04
                             $NN$            0.02              0.04

3-cycles                     $BB \cap B'$   -0.72              0.14
                             $BN \cap B'$    0.05              0.08
                             $BN \cap N'$    0.03              0.07
                             $NN \cap N'$   -0.05              0.09

2-stars                      $BB'$          -0.36              0.08
                             $BN'$          -0.08              0.04
                             $NN'$           0.06              0.04
                             $B'B$          -0.01              0.04
                             $B'N$          -0.04              0.03
                             $N'N$           0.07              0.02

Additional hypothesized      $BB \cap B$     0.57              0.06
constraints                  $BN \cap N$     0.17              0.05
                             $NB \cap N$     0.33              0.05
                             $NN \cap B$    -0.09              0.06

4.2. Padgett & Ansell's Florentine network

Our second example is an analysis of marriage and business ties among groups of Florentine families (Padgett & Ansell, 1993). In an analysis of the rise to power of the Medici family in Florence in the early fifteenth century, Padgett & Ansell constructed a number of network relations among 33 groups of elite families, including marriage and business or economic ties. The construction was based on a coding of various types of network relations among a 92-family ruling elite from Kent's (1978) description of the network foundations of the Medici party and their opponents. Padgett & Ansell used marriage and economic networks to derive a clustering of the 92 families into 33 family groups (using the CONCOR algorithm; see Breiger, Boorman & Arabie, 1975); they then coded a relation of a particular type between two family groups if there were at least two pairs of families, with one family from each group, linked by a relation of that type. The analysis presented below is for marriage and economic relations among these 33 family groups, shown in Figure 2a of Padgett & Ansell (1993); for the purpose of the analysis reported below, within-group relationships are ignored and the various types of economic ties are aggregated into a single business/economic relation. Thus, a marriage tie is coded from one group to another if a woman of the first group is married to a man in the second; a business/economic tie signifies the presence of trading or partnership relationships, the sharing or renting of real estate, or a bank employment relation (see Padgett & Ansell, 1993, pp. 1265–1266).

Padgett & Ansell used the interconnections among social and demographic factors, these relational ties, and actions on the part of Cosimo de' Medici to explain the source of the latter's extraordinary power; here, we examine the joint network structure of the marriage and business/economic ties.

We label the relations studied by Padgett & Ansell as $X_B$ (business ties) and $X_M$ (marriage ties). Their associated matrices are $B$ and $M$, respectively.

In Table 6 we report the fit of six classes of models, similar in construction to those reported for the grade 7 peer network. As for the grade 7 peer network, models 1a and 1b are two- and one-parameter complete independence models, respectively, and model 2 is a multiplexity model. It is clear from Table 6 that there is little improvement in fit of the two-parameter choice complete independence model (model 1a) over the one-parameter choice model (model 1b) ($\Delta G^2_{PL} = 0.7$ with one extra parameter); in addition, permitting dependencies among marriage and business ties for the same individuals does little to improve model fit ($\Delta G^2_{PL} = 0.4$ for model 2 compared to model 1a). Models 3a and 3b are reciprocity and exchange models. Model 3a adds to model 1a the reciprocity effects for $X_B$ and $X_M$ ties; model 3b further adds the exchange effect that allows conditional dependence of a marriage tie from $i$ to $j$ and a business tie from $j$ to $i$. The reciprocity effects in model 3a lead to a substantial improvement in fit over model 1a ($\Delta G^2_{PL} = 164.0$ with two additional parameters), but no further improvement is achieved by permitting the dyadic exchange of marriage and business ties ($\Delta G^2_{PL} = 0.2$). Model 4 is a path-dependent model and is a marginal improvement in fit over model 3b ($\Delta G^2_{PL} = 45.1$ with six additional parameters). Parameters corresponding to cycles with two or more business ties were excluded from the model because of the infrequency of occurrence of such structures.

Since, as Padgett & Ansell (1993) note, the gaining of hierarchical status was the primary consideration in the arrangement of marriage ties between elite families, we might expect marriage ties to exhibit a tendency towards transitivity. Hence model 5a assumes, in addition to conditional dependencies for paths of length 2, pairwise conditional dependencies among marriage ties from i to j, j to k and i to k (and hence adds a parameter corresponding to the relation $X = MM \cap M$). Further, all possible stars comprising two relations are added as well, in order to investigate possible interdependencies between marriage and business ties that are not evident at the level of ties from an actor i to an actor j (see the comparison between the complete independence model, 1a, and the multiplex model, 2). These dependencies also require various star parameters $\theta_Z$, for Z equal to $MM'$, $M'M$, $M'B$ and $BB'$.

Table 6. Summary of fit of models 1a–6d to the Florentine network

Model   No. of parameters   G^2_PL   Mean absolute residual
1a       2                  487.2    0.048
1b       1                  487.9    0.048
2        3                  486.8    0.048
3a       4                  323.2    0.032
3b       5                  323.0    0.032
4       11                  277.9    0.029
5a      18                  243.7    0.026
5b      17                  246.3    0.026
6a      21                  227.9    0.026
6b      23                  226.7    0.026
6c      23                  225.2    0.026
6d      23                  217.0    0.025
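As a quick check on the model comparisons, the fit differences can be recomputed from the entries of Table 6. The sketch below is illustrative only: the dictionary `table6` and the helper `compare` are names introduced here, the values are transcribed from the table, and differences computed from rounded table entries need not match the figures quoted in the text to the last digit. Because $G^2_{PL}$ is a pseudo-likelihood deviance, the differences are probably best read as heuristic guides rather than as formal test statistics.

```python
# (number of parameters, G^2_PL) for each model, transcribed from Table 6.
table6 = {
    "1a": (2, 487.2), "1b": (1, 487.9), "2": (3, 486.8),
    "3a": (4, 323.2), "3b": (5, 323.0), "4": (11, 277.9),
    "5a": (18, 243.7), "5b": (17, 246.3), "6a": (21, 227.9),
    "6b": (23, 226.7), "6c": (23, 225.2), "6d": (23, 217.0),
}


def compare(smaller, larger):
    """Return (drop in G^2_PL, extra parameters) going from `smaller` to `larger`."""
    p0, g0 = table6[smaller]
    p1, g1 = table6[larger]
    return round(g0 - g1, 1), p1 - p0


# A few of the comparisons discussed in the text.
for pair in [("1b", "1a"), ("1a", "2"), ("1a", "3a"),
             ("3a", "3b"), ("3b", "4"), ("5b", "6a")]:
    print(pair, compare(*pair))
```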

The fit of model 5a was a modest improvement over that of model 4 ($\Delta G^2_{PL} = 34.2$ with six additional parameters). The estimated parameter corresponding to the relation $MM \cap M$ is not large, so in model 5b the parameter is removed, with little effect on the fit of the model ($\Delta G^2_{PL} = 2.6$).

A final set of models fitted to the data investigated the possibility of structural differences in ties according to party affiliation. As Padgett & Ansell (1993) observed, the first 10 family groups are substantially identified with the Medici party (the Medici family themselves comprising group 1), whereas the remaining groups of families are not. Padgett & Ansell described the remarkable structural differences between the network of relations within the Medici party and within the remaining (largely oligarchic) set. Models 6a–6d therefore allow various model 5b parameters to differ according to whether they refer to ties lying within the collection of Medici blocks, to ties connecting non-Medici blocks, or to ties crossing the boundary between the two collections of blocks. Model 6a allows such variation for the density parameter and is a substantial improvement over model 5b ($\Delta G^2_{PL} = 18.4$ with four additional parameters). Model 6b permits the parameters for 'mixed' out-stars comprising marriage and business ties to differ for the three types of blocks and is not a substantial improvement over model 6a ($\Delta G^2_{PL} = 1.4$). Model 6c allows heterogeneity across blocks in the parameters for 2-paths comprising marriage and business ties; it also fails to improve fit compared to model 6a ($\Delta G^2_{PL} = 2.5$). The final model, 6d, permits heterogeneity across blocks in the parameters for paths comprising two marriage ties; in this case there is a modest improvement in fit compared to model 6a ($\Delta G^2_{PL} = 10.8$ with two additional parameters).

The estimated parameters for model 6d are shown in Table 7. The estimates suggest a strong tendency for reciprocated business ties, a tendency that is unsurprising given the form of business or economic ties such as partnerships. There are weaker tendencies for the existence of 2-paths comprising either marriage or business ties; marriage ties also appear to be more likely if they complete a cycle of three marriage ties. Padgett & Ansell (1993) noted the presence of these cycles and analysed both their development and their consequences; they make a compelling argument for their importance to the evolving structure of the oligarchy. It can also be seen from Table 7 that path structures in which an outgoing marriage tie is accompanied by an incoming business tie reduce the likelihood of the overall structure. Estimates of star parameters suggest the prevalence of heterogeneous stars in which a group of families have marriage ties with one group and business ties with another. The parameter estimates for homogeneous marriage in-stars and out-stars are both negative: there appears to have been a reduced conditional probability of a marriage tie to a family group if some other group also had such a tie and, to a lesser extent, if the first family group had another outgoing marriage tie.

The parameters for block-dependent densities suggest an enhanced likelihood of marriage ties within the Medici collection of family groups and, to a lesser extent, within the non-Medici collection; marriage ties between the two types of family groups were less likely. Business ties exhibit a substantially weaker pattern of the same form. Together, these characteristics of the network reflect what Padgett & Ansell noted was a remarkable interdependence of marriage and economic ties, on the one hand, and political partisanship, on the other, and they support their conclusion that the microstructure of marriage and economics was central to the formation of parties in Florence (1993, p. 1277). The block-dependence of marriage 2-paths takes a different and interesting form: such paths are less likely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection, and they are even more likely to link family groups of different types. The group containing members of the Medici family is the major contributor to this pattern, as they are the only Medici group with marriage connections outside the collection mobilized into the Medici party. Note that this structural effect is fitted at the same time as the cyclic pattern for marriage ties, so that although, as Padgett & Ansell noted, there are many more two-step marriage connections for non-Medici than for Medici partisans, many of the former connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).

Table 7. Parameter estimates for model 6d fitted to the Florentine network

Model parameter                        Z           theta_Z   Approximate standard error

1-paths (choice)                       M            -5.17    1.02
                                       B            -7.37    1.25

2-cycles (reciprocity and exchange)    M ∩ M′        0.95    0.94
                                       B ∩ B′       10.33    1.72
                                       M ∩ B′        0.65    1.08

2-paths                                MM            0.66    0.32
                                       MB            0.16    0.38
                                       BM           -0.84    0.37
                                       BB            1.26    0.95

3-cycles                               MM ∩ M′       2.12    0.61
                                       MB ∩ M′      -0.35    0.85

2-stars                                MM′          -1.55    0.37
                                       M′M          -0.43    0.20
                                       BB′          -1.53    1.08
                                       B′B          -0.85    0.99
                                       MB′          -0.14    0.36
                                       M′B           0.92    0.35

Subgroup-dependent M effects
  1-paths within Medici                              3.71    1.12
  1-paths between subgroups                         -4.67    1.92
  1-paths within other subgroups                     0.96

Subgroup-dependent B effects
  1-paths within Medici                              0.70    1.06
  1-paths between subgroups                         -0.80    0.87
  1-paths within other subgroups                     0.10

Subgroup-dependent MM effects
  2-paths within Medici                             -1.33    0.46
  2-paths between subgroups                          1.08    0.44
  2-paths within other subgroups                     0.25

Thus, model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell's detailed account.

5. Conclusion

The multivariate p* model is very general in form and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996), and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number, r, of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research: much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus, a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus, the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form z(x) of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different 'parts' of the network (and in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that, in the literature on algebraic models for multivariate networks, there is a second tradition that can usefully guide such statistical developments. Local structural descriptions based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995) describe heterogeneity across individuals. Thus, a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown in general how Markov chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).
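To make the simulation step concrete, the following is a minimal sketch of a Metropolis sampler that toggles one tie at a time. It is not the procedure used by Preisler (1993) or by the authors: the function `simulate`, the restriction to density and reciprocity effects, and all parameter values are assumptions chosen for illustration. A parametric bootstrap would repeatedly simulate networks in this way from the fitted model, re-estimate the parameters on each simulated network, and summarize the spread of the re-estimates.

```python
import numpy as np


def simulate(theta, rho, g, n_steps, seed=0):
    """Metropolis sampler for a hypothetical model with density and reciprocity.

    theta[m]: density parameter for relation m (statistic: number of m-ties).
    rho[m]:   reciprocity parameter (statistic: number of mutual m-dyads).
    """
    rng = np.random.default_rng(seed)
    r = len(theta)
    x = np.zeros((g, g, r), dtype=int)
    for _ in range(n_steps):
        i, j = rng.choice(g, size=2, replace=False)  # two distinct actors
        m = rng.integers(r)                          # a relation type
        # Change in the exponent when the (i, j, m) tie goes from 0 to 1.
        delta = theta[m] + rho[m] * x[j, i, m]
        log_ratio = delta if x[i, j, m] == 0 else -delta
        if np.log(rng.random()) < log_ratio:
            x[i, j, m] = 1 - x[i, j, m]              # accept the toggle
    return x


# Illustrative (assumed) parameter values for two relations among six actors.
x = simulate(theta=[-1.0, -2.0], rho=[1.5, 0.0], g=6, n_steps=5000)
print(x.sum(axis=(0, 1)))  # number of ties of each type in the final state
```

The proposal (a uniformly chosen tie variable, toggled) is symmetric, so the acceptance step only needs the change in the exponent of the model.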

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754) and the National Institute of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance and Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.

Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.

Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.

Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Association, 47, 77–92.

Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.

Boorman, S. A., & White, H. C. (1976). Social structure from multiple networks: II. Role structures. American Journal of Sociology, 81, 1384–1446.

Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.

Breiger, R. L., Boorman, S. A., & Arabie, P. (1975). An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.

Coleman, J. S., Katz, E., & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N., & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R., & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B., & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure: II. Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E., & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M., & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.

Fienberg, S. E., Meyer, M. M., & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds), Operations research proceedings 1986, pp. 455–460. Berlin, Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathématiques, Informatique et Sciences Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B., & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O., & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds), Quo Vadis, Graph Theory? Annals of Discrete Mathematics, 55, 349–366.

Frank, O., & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J., & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J., & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W., & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W., & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J., & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D., & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.


Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L., & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E., & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A., & Pattison, P. (1998). Multivariate p* models of producers' market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P., & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F., & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P., & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G., & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). $p_{1/2}$. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D., & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M., & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.


Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S., & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S., & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 60, 401–425.

Wasserman, S., & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S., & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A., & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C., & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999



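with associated (socio)matrices $\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_r$. We also consider the relations constructed from intersections and compositions of these r measured relations. Formally, the intersection $\mathbf{X}_k \cap \mathbf{X}_h$ of relations $\mathbf{X}_k$ and $\mathbf{X}_h$ is given by the array $\mathbf{X}_k \cap \mathbf{X}_h$, which has entries

$$(\mathbf{X}_k \cap \mathbf{X}_h)_{ij} = \begin{cases} 1 & \text{if } (\mathbf{X}_k)_{ij} = 1 \text{ and } (\mathbf{X}_h)_{ij} = 1, \\ 0 & \text{otherwise.} \end{cases}$$

The composition (or compound relation) $\mathbf{X}_k \mathbf{X}_h$ of relations $\mathbf{X}_k$ and $\mathbf{X}_h$ is given by the array $\mathbf{X}_k \mathbf{X}_h$, which has entries

$$(\mathbf{X}_k \mathbf{X}_h)_{ij} = \begin{cases} 1 & \text{if } (\mathbf{X}_k)_{il} = 1 \text{ and } (\mathbf{X}_h)_{lj} = 1 \text{ for some } l \in N, \\ 0 & \text{otherwise.} \end{cases}$$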
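The two constructions can be illustrated on small binary sociomatrices. The sketch below assumes relations are stored as NumPy 0/1 arrays; the function names and the three-actor matrices `M` and `B` are hypothetical and serve only as an example.

```python
import numpy as np


def intersection(xk, xh):
    """Entry (i, j) is 1 when both relations have a tie from i to j."""
    return (xk.astype(bool) & xh.astype(bool)).astype(int)


def composition(xk, xh):
    """Entry (i, j) is 1 when some l has a k-tie i -> l and an h-tie l -> j."""
    return ((xk @ xh) > 0).astype(int)


# Hypothetical relations on three actors.
M = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 0, 0]])
B = np.array([[0, 1, 0],
              [0, 0, 0],
              [1, 0, 0]])

print(intersection(M, B))  # only the tie 0 -> 1 is present in both relations
print(composition(M, M))   # 0 -> 1 -> 2 gives an MM path from 0 to 2
print(composition(M, B))   # 1 -M-> 2 -B-> 0 gives an MB path from 1 to 0
```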

Since we are concerned here with multivariate networks, we also view the sequence of matrices $(\mathbf{X}_1, \mathbf{X}_2, \ldots, \mathbf{X}_r)$ as defining a three-way array $\mathbf{X}$ of size $g \times g \times r$, with entries $X_{ijm} = (\mathbf{X}_m)_{ij}$. Since these (socio)matrices will be assumed to be random quantities, we use lower-case bold-face characters (such as $\mathbf{x}$) to denote realizations of the random quantities.

To specify the multivariate p* model, we introduce several new arrays constructed from $\mathbf{X}$. First, we define $\mathbf{X}^{+}_{ijm}$ as the array formed from $\mathbf{X}$ where the tie from i to j of type m is forced to be present:

$$(\mathbf{X}^{+}_{ijm})_{nop} = \begin{cases} X_{nop} & \text{if } (n, o, p) \neq (i, j, m), \\ 1 & \text{if } (n, o, p) = (i, j, m). \end{cases}$$

Thus $\mathbf{X}^{+}_{ijm}$ differs at most from $\mathbf{X}$ by the (i, j, m)th entry, which is forced to be 1. Next, we define $\mathbf{X}^{-}_{ijm}$ as the array formed from $\mathbf{X}$ where the tie from i to j of type m is forced to be absent:

$$(\mathbf{X}^{-}_{ijm})_{nop} = \begin{cases} X_{nop} & \text{if } (n, o, p) \neq (i, j, m), \\ 0 & \text{if } (n, o, p) = (i, j, m). \end{cases}$$

We also define $\mathbf{X}^{c}_{ijm}$ as the matrix for the complement relation for $\mathbf{X}$ of the tie from i to j of type m:

$$(\mathbf{X}^{c}_{ijm})_{nop} = \begin{cases} X_{nop} & \text{if } (n, o, p) \neq (i, j, m), \\ \text{undefined} & \text{if } (n, o, p) = (i, j, m). \end{cases}$$

The complement relation has no relational tie of type m coded from i to j; one can view this single variable as missing.
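The three derived arrays are simple to build once the network is held as a $g \times g \times r$ array. The following is a minimal sketch, with the function names and the toy data being assumptions made here; the undefined entry of the complement array is represented by NaN purely for convenience.

```python
import numpy as np


def force_tie(x, i, j, m, value):
    """Copy of the g x g x r array x with entry (i, j, m) set to `value`."""
    out = x.copy()
    out[i, j, m] = value
    return out


def x_plus(x, i, j, m):
    """Array with the tie from i to j of type m forced to be present."""
    return force_tie(x, i, j, m, 1)


def x_minus(x, i, j, m):
    """Array with the tie from i to j of type m forced to be absent."""
    return force_tie(x, i, j, m, 0)


def x_complement(x, i, j, m):
    """Same array with the (i, j, m) entry treated as missing (NaN)."""
    return force_tie(x.astype(float), i, j, m, np.nan)


# Hypothetical data: three actors, two relation types.
x = np.zeros((3, 3, 2), dtype=int)
x[0, 1, 0] = 1
x[1, 0, 1] = 1

xp = x_plus(x, 0, 2, 1)
xm = x_minus(x, 0, 1, 0)
# Each derived array differs from x in at most the single forced entry.
print(int((xp != x).sum()), int((xm != x).sum()))
```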

As in Wasserman & Pattison (1996), we will let $\omega$ represent logits: log-odds ratios comparing the probability of one outcome of a random variable to the probability of another outcome, on a logarithmic scale.

3. Multivariate p*

The original specification of the class of models p* was just for a single dichotomous relation, as described by Wasserman & Pattison (1996) (see also Frank & Strauss, 1986; Rennolls, 1995; Strauss & Ikeda, 1990). Generalizations to more than one relation were mentioned in concluding remarks by Frank & Strauss (1986, Section 6) and by Strauss & Ikeda (1990, Section 5), and discussed in brief by Frank (1991, 1997) and Frank & Nowicki (1993).


3.1. Theory

We first present a generalization of the basic p* model to multivariate networks. We define a set of random variables based on the relational ties in the network, and then construct a dependence graph for this situation. The Hammersley–Clifford theorem (Besag, 1974) posits a probability distribution for these random variables by using the postulated dependence graph. The exact form of the dependence graph depends on the nature of the substantive hypotheses about the social network under study; we discuss such hypotheses at length.

3.1.1. Probability models for multivariate directed random graphs

Any observed multivariate network may be regarded as a realization $\mathbf{x} = [x_{ijm}]$ of a random three-way binary array $\mathbf{X} = [X_{ijm}]$. In general, the entries of the array $\mathbf{X}$ cannot be assumed to be independent; consequently, it is helpful to specify a dependence structure for the random variables $X_{ijm}$, as originally suggested by Frank & Strauss (1986).

The dependence structure for these random variables is determined by the dependence graph D of the random array $\mathbf{X}$. D is itself a graph whose nodes are elements of the index set $\{(i, j, m) : i, j \in N,\ i \neq j,\ m \in R\}$ for the random variables in $\mathbf{X}$, and whose edges signify pairs of the random variables that are assumed to be conditionally dependent (given the values of all other random variables). More formally, a dependence graph for a multivariate social network has node set

$$N_D = \{(i, j, m) : i, j \in N,\ i \neq j,\ m \in R\}.$$

The edges of D are given by

$$E_D = \{((i, j, m), (k, l, h)) : X_{ijm} \text{ and } X_{klh} \text{ are conditionally dependent}\}.$$

The dependence graph is an example of what is termed an independence graph in the graphical modelling literature (for example, Lauritzen, 1996); see Robins (1998) for an extended discussion of the application of graphical modelling techniques to social network models.

As Frank & Strauss (1986) observed for univariate graphs and associated two-way binary arrays, several well-known classes of distributions for random graphs may be specified in terms of the structure of the dependence graph. For example, the assumption of conditional independence for all pairs of random variables representing distinct relational ties (that is, $X_{ijm}$ and $X_{klh}$ are independent whenever $i \neq k$ and/or $j \neq l$) leads to the class of Bernoulli multigraphs (see Frank & Nowicki, 1993; Wasserman & Pattison, in press); the assumption of conditional dependence of $X_{ijm}$ and $X_{klh}$ if and only if $\{i, j\} = \{k, l\}$ leads to the class of multivariate dyad independence models (see Wasserman, 1987; Wasserman & Pattison, in press). The assumption of conditional independence of $X_{ijm}$ and $X_{klh}$ if and only if $\{i, j\} \cap \{k, l\} = \emptyset$ gives rise to the class of multivariate Markov random graphs. Of course, if the dependence graph is fully connected, then a general class of random graphs is obtained.
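The three dependence assumptions just described can be compared by building the corresponding dependence graphs for a very small network. The sketch below is an illustration under assumed names (`dependence_graph`, `bernoulli`, `dyad`, `markov`); actors and relation types are simply numbered from zero.

```python
from itertools import combinations, permutations


def dependence_graph(g, r, dependent):
    """Nodes are triples (i, j, m) with i != j; edges join pairs flagged by `dependent`."""
    nodes = [(i, j, m) for i, j in permutations(range(g), 2) for m in range(r)]
    edges = [(a, b) for a, b in combinations(nodes, 2) if dependent(a, b)]
    return nodes, edges


def bernoulli(a, b):
    # Only variables for the same ordered pair (i, j) are dependent.
    return a[:2] == b[:2]


def dyad(a, b):
    # Dependent if and only if the unordered pairs {i, j} and {k, l} coincide.
    return {a[0], a[1]} == {b[0], b[1]}


def markov(a, b):
    # Dependent if and only if {i, j} and {k, l} share at least one actor.
    return bool({a[0], a[1]} & {b[0], b[1]})


for rule in (bernoulli, dyad, markov):
    nodes, edges = dependence_graph(g=3, r=2, dependent=rule)
    print(rule.__name__, len(nodes), len(edges))
```

With only three actors, every pair of dyads shares an actor, so the Markov dependence graph is complete; the distinction from the fully connected case appears only for four or more actors.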

The Hammersley–Clifford theorem (Besag, 1974) establishes that a probability model for $\mathbf{X}$ depends only on the complete subgraphs, or cliques, of the dependence graph D. (A subset A of $N_D$ is complete if every pair of nodes in A is linked by an edge of D. A subset comprising a single node is also regarded as complete.) In particular, application of the Hammersley–Clifford theorem yields a characterization of $P(\mathbf{X} = \mathbf{x})$ in the form of an exponential family of distributions:

$$P(\mathbf{X} = \mathbf{x}) = \frac{1}{\kappa} \exp\left( \sum_{A \subseteq N_D} \lambda_A \prod_{(i,j,m) \in A} x_{ijm} \right), \qquad (1)$$

where $\kappa = \sum_{\mathbf{x}} \exp\left( \sum_{A \subseteq N_D} \lambda_A \prod_{(i,j,m) \in A} x_{ijm} \right)$ is a normalizing quantity; D is the dependence graph for $\mathbf{X}$ (the summation is over all subsets A of nodes of D); $\prod_{(i,j,m) \in A} x_{ijm}$ is the sufficient statistic corresponding to the parameter $\lambda_A$; and $\lambda_A = 0$ whenever the subgraph induced by the nodes in A is not a clique of D.
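For a very small network, model (1) can be evaluated exactly by enumerating every possible array. The sketch below does this for two actors and two relation types; the parameter values in `lam` are invented for illustration, and the names `exponent`, `prob` and `kappa` are introduced here. Enumeration is feasible only because there are four tie variables, and hence 16 states.

```python
from itertools import product
from math import exp, prod

# Hypothetical tiny case: g = 2 actors and r = 2 relations, so four tie variables.
nodes = [(0, 1, 0), (0, 1, 1), (1, 0, 0), (1, 0, 1)]

# Illustrative (assumed) parameters lambda_A, keyed by clique A of the dependence graph.
lam = {
    frozenset([(0, 1, 0)]): -1.0,
    frozenset([(0, 1, 1)]): -1.0,
    frozenset([(1, 0, 0)]): -1.0,
    frozenset([(1, 0, 1)]): -1.0,
    frozenset([(0, 1, 0), (0, 1, 1)]): 0.5,  # multiplexity, ties from 0 to 1
    frozenset([(1, 0, 0), (1, 0, 1)]): 0.5,  # multiplexity, ties from 1 to 0
    frozenset([(0, 1, 0), (1, 0, 0)]): 2.0,  # reciprocity on relation 0
}


def exponent(x):
    """Sum over cliques A of lambda_A times the product of x_ijm over A."""
    return sum(lam_a * prod(x[n] for n in clique) for clique, lam_a in lam.items())


# Every possible realization of the four binary tie variables.
states = [dict(zip(nodes, bits)) for bits in product((0, 1), repeat=len(nodes))]
kappa = sum(exp(exponent(x)) for x in states)  # normalizing quantity


def prob(x):
    """P(X = x) under model (1) with the parameters above."""
    return exp(exponent(x)) / kappa


total = sum(prob(x) for x in states)
print(len(states), round(total, 10))
```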

The set of non-zero parameters in a model for $P(\mathbf{X} = \mathbf{x})$ is thus determined by the collection of the maximal cliques of the dependence graph. A maximal clique is a complete subgraph that is not properly contained in any other complete subgraph. Note that any subgraph of a complete subgraph is also complete, so that if A is a maximal clique of D, then there will be non-zero parameters for A and all of its subgraphs.

3.1.2. Dependence structures for social networks

It is clear from model (1) (which we can refer to as the multivariate p* distribution) that, in order to construct a probability model for a multivariate random array, we need to specify an appropriate dependence structure. We therefore consider some likely forms of dependencies arising in multivariate arrays constructed from various types of social networks. The literature on structural models for social networks contains a number of theoretical claims about the structural properties of networks that can be used to construct candidate dependence structures.

Multiplexity: Interdependence of relations linking a pair of individuals. The large literature on role-sets (Merton, 1957; Winship & Mandel, 1983; see also Chapter 12 of Wasserman & Faust, 1994) attests to the widespread belief that there is a likely dependence between different ties linking any given pair of individuals. The essence of the claim is that the presence of one type of tie between individuals is likely to affect the presence of other types of tie, and that, over time, distinctive role-sets comprising the relations linking a pair of individuals characterize the relationship from individual i to individual j. Such multiplex interdependencies lead to maximal cliques in the dependence graph of the form $\{(i, j, 1), (i, j, 2), \ldots, (i, j, r)\}$; if these are the only dependencies that are assumed, the general class of Bernoulli multigraphs is obtained. The parameters of the model have the form $\lambda_A$, where A is any subset of the form $\{(i, j, m_1), (i, j, m_2), \ldots, (i, j, m_q)\}$; the corresponding sufficient statistic in model (1) is $\prod_{h=1}^{q} x_{ijm_h}$. Of course, a special case of the model is obtained if we assume complete independence of all observations. The maximal cliques of the dependence graph are then of the form $\{(i, j, m)\}$, with exactly one parameter corresponding to each random variable in the array. If homogeneity is assumed for all node pairs (i, j), then the model has just one parameter for each relation m (with sufficient statistic $\sum_{i,j} x_{ijm}$); if homogeneity is imposed across all random variables in the multivariate network, then the model possesses a single parameter, with sufficient statistic $\sum_{i,j,m} x_{ijm}$.
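Under homogeneity, the sufficient statistics named above reduce to simple counts over the array. A minimal sketch follows, assuming the network is stored as a $g \times g \times r$ NumPy array; the function names and the toy data are hypothetical.

```python
import numpy as np


def density_stats(x):
    """Sum over (i, j) of x_ijm, giving one count per relation m."""
    return x.sum(axis=(0, 1))


def multiplex_stat(x, relations):
    """Sum over (i, j) of the product of x_ijm over the listed relations."""
    return int(np.prod(x[:, :, list(relations)], axis=2).sum())


# Hypothetical data: three actors, relations 0 and 1.
x = np.zeros((3, 3, 2), dtype=int)
x[0, 1, 0] = x[0, 1, 1] = 1  # a multiplex tie from actor 0 to actor 1
x[1, 2, 0] = 1
x[2, 0, 1] = 1

print(density_stats(x))           # ties of each type
print(int(x.sum()))               # single statistic under full homogeneity
print(multiplex_stat(x, (0, 1)))  # ordered pairs carrying both relations
```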

Exchange and reciprocity. It has also been widely argued that a tie of one type from an individual i to another individual j may be conditionally dependent on ties from j to i of other types (see, for example, Parsons, 1966, for a traditional appeal to mutually consistent expectations through role norms, and Leifer, 1988, for an alternative and interesting dynamic account). If such conditional dependencies alone are assumed, then maximal cliques of D have the form $\{(i, j, m), (j, i, l)\}$. If these conditional dependencies are assumed as well as the multiplex conditional dependencies, maximal cliques have the form

$$\{(i, j, 1), (i, j, 2), \ldots, (i, j, r), (j, i, 1), (j, i, 2), \ldots, (j, i, r)\}.$$

In the latter case, model (1) describes the multivariate dyad independence model, termed the multivariate $p_1$ model (Wasserman, 1987).

Role interlocking: path dependence. A third type of argument has pointed to the potential importance of role interlocking in social networks (for example, Boorman & White, 1976; Boyd, 1991; Lorrain & White, 1971; Pattison, 1993; White, 1977). It has been argued that the interrelationships among distinct types of ties can be represented by a partial ordering among labelled paths in a social network, where labelled paths systematically trace connections among sequences of individuals (see Pattison, 1993). More specifically, a path with the label mh links individual i to individual j if there is a tie of type m from i to some intermediate individual l and a tie of type h from l to j (that is, if $X_{ilm} = 1$ and $X_{ljh} = 1$). Longer paths are defined recursively: a tie of type mhn links individual i to individual j if there is some individual l such that i is linked to l by a path with the label mh and l is linked to j by a tie with the label n. We refer to the path mhn as the concatenation of paths mh and n, and note that concatenation is associative; that is, paths constructed as the concatenation of mh and n link precisely the same pairs of individuals as paths constructed by the concatenation of m and hn. Paths in networks have been claimed both to provide the essential framework for the flow of social processes, as, for example, in the research on the diffusion of innovations (see, for example, Coleman, Katz & Menzel, 1966; Michaelson, 1990), and to give rise to some powerful anticipatory effects (see Lee, 1969; Mayer, 1977). The most rudimentary form of dependence associated with social structures conceived in this form involves conditional dependence between the variables $X_{ilm}$ and $X_{ljh}$. The maximal cliques induced by such an assumption are cycles of length 2, $\{(i,j,m), (j,i,h)\}$, and cycles of length 3, $\{(i,j,m), (j,l,h), (l,i,n)\}$. The resulting random graph model is new and is a special case of the multivariate Markov random graphs mentioned earlier. Here, we term the corresponding version of model (1) the path-dependent random multigraph model.
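The concatenation of labelled paths can be illustrated with a minimal Python sketch. It assumes that each relation is stored as a 0/1 matrix and that a compound path is formed by Boolean composition; the function name `compose` and the three small relations are hypothetical and serve only as an illustration.

```python
def compose(a, b):
    """Boolean composition of two relations given as 0/1 matrices.

    (ab)[i][j] = 1 if some intermediate l has a[i][l] = 1 and b[l][j] = 1.
    """
    g = len(a)
    return [[int(any(a[i][l] and b[l][j] for l in range(g))) for j in range(g)]
            for i in range(g)]


# Made-up relations of types m, h and n on four actors (illustration only).
M = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]
H = [[0, 0, 1, 0], [1, 0, 0, 0], [0, 1, 0, 1], [0, 0, 0, 0]]
N = [[0, 1, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]

MH = compose(M, H)                   # pairs linked by a path labelled mh
left = compose(MH, N)                # concatenation of mh and n
right = compose(M, compose(H, N))    # concatenation of m and hn
print(left == right)                 # both link the same pairs of individuals
```

The comparison in the last line is the associativity property noted in the text: the two ways of building the path mhn link the same ordered pairs.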

Actor effects. The fourth argument has arisen in the social cognition literature and posits actor attributes or biases associated with either the actor from whom the tie is directed (leading to a so-called row effect) or the actor to whom the tie is directed (a so-called column effect). Row effects are associated with conditional dependencies of the form $\{(i,j,m), (i,k,h)\}$ and give rise to maximal cliques in D of the form

$$\{(i,1,1), (i,1,2), \ldots, (i,1,r), (i,2,1), (i,2,2), \ldots, (i,2,r), \ldots, (i,g,1), (i,g,2), \ldots, (i,g,r)\}.$$

Such dependencies are likely to be assumed when actor i is the source of information about all relational ties emanating from actor i; they are also necessarily imposed if constraints are placed on the total number of ties directed from actor i (see Holland & Leinhardt, 1973). Column effects are associated with conditional dependencies of the form $\{(i,j,m), (k,j,h)\}$, and so with maximal cliques

$$\{(1,j,1), (1,j,2), \ldots, (1,j,r), (2,j,1), (2,j,2), \ldots, (2,j,r), \ldots, (g,j,1), (g,j,2), \ldots, (g,j,r)\}.$$

Position effects and blockmodels. A fifth theme in the structural analysis of multiple networks is that distinctive patterns of inter-individual ties are associated with particular social positions. Thus, individuals occupying similar social positions may exhibit similar conditional dependencies among ties, whereas those occupying distinct positions may possess quite distinct inter-tie dependencies. Thus, knowledge of social position may be used as a basis for some hypothesized equations among the parameters referring to particular patterns of conditional dependencies (determined by cliques in the dependence graph); these issues are further discussed below.

Interdependence of interlocking roles. In addition, several of these arguments may be combined. For instance, if we assume conditional dependencies associated with arguments for multiplexity, reciprocity and exchange, as well as role-interlocking effects, then the class of Markov random multigraphs results; its maximal cliques have the form of either a multivariate triad,

$$
\begin{aligned}
\{ & (i,j,1), (i,j,2), \ldots, (i,j,r), (j,k,1), (j,k,2), \ldots, (j,k,r), \\
   & (k,i,1), (k,i,2), \ldots, (k,i,r), (j,i,1), (j,i,2), \ldots, (j,i,r), \\
   & (k,j,1), (k,j,2), \ldots, (k,j,r), (i,k,1), (i,k,2), \ldots, (i,k,r) \},
\end{aligned}
$$

or a multivariate star,

$$
\begin{aligned}
\{ & (1,i,1), (1,i,2), \ldots, (1,i,r), (2,i,1), (2,i,2), \ldots, (2,i,r), \ldots, \\
   & (g,i,1), (g,i,2), \ldots, (g,i,r), (i,1,1), (i,1,2), \ldots, (i,1,r), \\
   & (i,2,1), (i,2,2), \ldots, (i,2,r), \ldots, (i,g,1), (i,g,2), \ldots, (i,g,r) \}.
\end{aligned}
$$

Note that these three assumptions also entail actor effects; hence, we claim that the class of Markov random multigraphs is a quite plausible framework for the modelling of structure in multiple social networks.

3.1.3. Homogeneity constraints

For many of the specific dependence graphs that we have discussed, particularly for Markov random multigraphs, model (1) may require the estimation of a large number of parameters. It is often useful, therefore, to introduce certain equality constraints among the parameters, or to set certain parameters to zero. One can define a class of homogeneous models for multivariate networks in which networks that are isomorphic under relabellings of the nodes are equiprobable.

More generally, we introduce an assumption that parameters corresponding to certain isomorphic configurations of nodes are equal. We identify a random graph configuration with a subset A of $N_D$, and we call configurations corresponding to A and B isomorphic if there is a one-to-one mapping w on the nodes in N such that $(i,j,m) \in A$ if and only if $(w(i), w(j), m) \in B$, for $i, j \in N$, $m \in R$. If two configurations A and B are isomorphic, we set $\lambda_A = \lambda_B$ and note that the sufficient statistic corresponding to $\lambda_A$ becomes

$$\sum_{B} \prod_{(i,j,m) \in B} x_{ijm},$$

where the summation is over all configurations B isomorphic to A.

A more restricted form of parameter equating may also be useful when the nodes of the random graph are hypothesized to fall into distinct classes or positions (as in an a priori blockmodel; see, for example, White et al., 1976; Wasserman & Anderson, 1987; Wasserman & Faust, 1994, Chapter 10). In this case, the random graph nodes of the configuration identified with the subset A may be regarded as coloured, and two configurations A and B are defined as isomorphic if there is a one-to-one mapping w on the nodes of N such that

1. $(i,j,m) \in A$ if and only if $(w(i), w(j), m) \in B$;
2. i and w(i) have the same colour;
3. j and w(j) have the same colour.

We then set $\lambda_A = \lambda_B$ only if A and B are isomorphic (using this more restrictive definition).

3.2. The multivariate p* model

As mentioned, we refer to equation (1) as the multivariate p* model. The parameters of the model correspond to the cliques of the dependence graph D. The sufficient statistic corresponding to the parameter $\lambda_A$ for clique A of D has the form $\prod_{(i,j,m) \in A} x_{ijm}$; in the case where homogeneity effects have been imposed, the sufficient statistics are counts of such values over cliques whose parameters are set to be equal.

3.2.1. Introduction

The dependence structures for social networks described in the preceding section give rise to network statistics identified in Table 1. In order more easily to define the statistics, we introduce a counting function f for an array Z as the sum of entries in the array: $f_Z = \sum_{i,j} Z_{ij}$. The function f is a count of the number of distinct ordered pairs of nodes i and j for which there is a relational tie of type Z. For convenience, we refer to the parameter corresponding to $f_Z$ as $\theta_Z$.

Table 1. Some statistics and parameters for univariate and multivariate relations

Effect                     Parameter         Graph statistic in ‘count form’

Single dichotomous relations
Choice                     $\theta$          $f_X$
Mutuality                  $\rho$            $f_{X \cap X'}$
Transitivity               $\tau$            $f_{XX \cap X}$
Expansiveness              $\alpha_i$        $f_{X \cap R_i}$
Attractiveness             $\beta_j$         $f_{X \cap C_j}$
m-paths                    $\pi_m$           $f_{X^m}$
Subgroup density           $\phi_{[st]}$     $f_{X \cap \delta_{st}}$
Subgroup mutuality         $\rho_{[st]}$     $f_{X \cap X' \cap \delta_{st}}$
Subgroup transitivity      $\tau_{[sut]}$    $f_{(X \cap \delta_{su})(X \cap \delta_{ut}) \cap (X \cap \delta_{st})}$

Multivariate relations
Association                $C$               $f_{X \cap Y}$
Multiplexity               $\theta_{kl}$     $f_{X_k \cap X_l}$
Exchange                   $\rho_{kl}$       $f_{X_k \cap X'_l}$
Generalized transitivity   $\tau_{klm}$      $f_{(X_k X_l) \cap X_m}$

When homogeneity constraints are imposed, we can represent the sufficient statistics in a compact form. For the assumption of multiplex conditional dependencies, any clique in $N_D$ has the form

$$A = \{(i,j,m_1), (i,j,m_2), \ldots, (i,j,m_q)\};$$

thus, in the homogeneous case, the sufficient statistic for the multiplex parameter associated with the clique A is $f_Z$, where $Z = X_{m_1} \cap X_{m_2} \cap \cdots \cap X_{m_q}$. (Note that any non-empty subset of relations gives rise to a clique of this form, so that we also have statistics of the form $f_{X_m}$, $f_{X_k \cap X_l}$, and so forth.)

Reciprocity cliques of the form $\{(i,j,m), (j,i,l)\}$ give rise to the exchange statistics $f_{X_k \cap X'_l}$. Cliques in role-interlocking dependence structures lead to additional 2-path and 3-cycle statistics of the form $f_{X_m X_h}$ and $f_{(X_m X_h) \cap X'_n}$, respectively.

Some of the statistics for parameters reflecting row and column effects can be defined using the indicator matrices $R_i$ and $C_j$, whose elements are given by

$$
(R_i)_{kl} = \begin{cases} 1 & \text{if } k = i, \\ 0 & \text{otherwise,} \end{cases}
\qquad
(C_j)_{kl} = \begin{cases} 1 & \text{if } l = j, \\ 0 & \text{otherwise.} \end{cases}
$$

In order to define statistics for the Markov random multigraph model, let $\mathcal{R}_k$ be any subset of relations and define $Y_k$ as the intersection of the relations in $\mathcal{R}_k$. The triad statistic corresponding to a general multivariate triad has the general form $f_Z$, with $Z = (Y_1 \cap Y'_4)(Y_2 \cap Y'_5) \cap (Y_3 \cap Y'_6)$ for some $Y_1, Y_2, \ldots, Y_6$.

When homogeneity is imposed only within S possible blocks or positions, the network statistics that arise correspond to within-block sums and can be represented by using the matrix $\delta_{st}$, with entries

$$
(\delta_{st})_{ij} = \begin{cases} 1 & \text{if } i \in \text{block } s \text{ and } j \in \text{block } t, \\ 0 & \text{otherwise.} \end{cases}
$$

For example, in the case of any homogeneous statistic $f_Z$, the block-homogeneous set of statistics is $f_{Z \cap \delta_{st}}$, $s = 1, 2, \ldots, S$; $t = 1, 2, \ldots, S$.

Some other network statistics and associated parameters are also presented in Table 1. This table also identifies the parameter labels used in Wasserman & Pattison (1996) and their generalizations to multivariate networks.

Note that each of the statistics described above may be assumed to be homogeneous, or may be allowed to depend on some mutually exclusive and exhaustive partition of actors or pairs of actors. For example, generalized transitivity statistics may be calculated for every triple of subgroups arising from a partition (for example, $f_{(X_k \cap \delta_{su})(X_l \cap \delta_{ut}) \cap (X_m \cap \delta_{st})}$) and may be used to assess the homogeneity of generalized transitivity across subgroups.
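The count-form statistics above are straightforward to compute from 0/1 matrices. The following Python sketch is an illustration only: it assumes that the intersection of two arrays is their element-wise product, that a product such as $X_k X_l$ is the ordinary matrix product (so that 2-paths are counted), and that relations have zero diagonals. The helper names, the two 3-actor relations and the block membership vector are all hypothetical.

```python
def count(z):
    """Counting function f: the sum of all entries of the array z."""
    return sum(sum(row) for row in z)

def meet(a, b):
    """Intersection of two arrays, taken as the element-wise product."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def transpose(a):
    """Converse relation X' (rows and columns exchanged)."""
    return [list(col) for col in zip(*a)]

def matmul(a, b):
    """Ordinary matrix product; entry (i, j) counts 2-paths from i to j."""
    g = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(g)) for j in range(g)]
            for i in range(g)]

def block_indicator(membership, s, t):
    """Indicator matrix with entry 1 when i is in block s and j is in block t."""
    g = len(membership)
    return [[int(membership[i] == s and membership[j] == t) for j in range(g)]
            for i in range(g)]

# Made-up relations on three actors and a made-up two-block partition.
B = [[0, 1, 1], [1, 0, 0], [0, 1, 0]]
N = [[0, 0, 0], [0, 0, 1], [1, 0, 0]]
membership = [0, 0, 1]

stats = {
    "choice_B": count(B),
    "choice_N": count(N),
    "multiplexity_BN": count(meet(B, N)),
    "reciprocity_B": count(meet(B, transpose(B))),
    "exchange_BN": count(meet(B, transpose(N))),
    "transitivity_BBB": count(meet(matmul(B, B), B)),
    "density_B_block_0_1": count(meet(B, block_indicator(membership, 0, 1))),
}
print(stats)
```

Summing the block-restricted choice statistic over all pairs of blocks recovers the homogeneous choice statistic, which is one quick check on a partition.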


3.2.2. The model

In combination with various homogeneity constraints, model (1) can be written in the general form

$$P(X = x) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)}, \tag{2}$$

where $\theta$ is a vector of model parameters and z(x) is a vector of network statistics. As we have described, these vectors depend on the structure of the hypothesized dependence graph and on whether any homogeneity constraints have been proposed.

The model is of exponential family form; that is, the probability function depends on an exponential function of a linear combination of network statistics. In some cases, constraints on the elements of $\theta$ are required in order to ensure a set of uniquely determined parameters (as we illustrate later with our examples). Usually the elements of $\theta$ are unknown and must be estimated.

The function $\kappa(\theta)$ in the denominator of model (2) is a normalizing quantity whose value guarantees that the probability distribution is indeed proper, summing to unity over the sample space of the random variable X (the set of all possible multivariate networks with r relations and g actors).

Estimation of the parameters of models that assume only multiplexity and/or generalized reciprocity and exchange effects (as in the multivariate $p_1$ model) is not particularly difficult. In these cases, the likelihood function is simply the product of the probabilities for each multivariate tie or dyad (for example, see Wasserman, 1987). Estimation of parameters of the general multivariate p* model is not straightforward, however. The likelihood function for the parameters $\theta$ of p* depends on the complicated normalizing quantity $\kappa(\theta)$, which makes maximum likelihood estimation difficult, except in special circumstances (such as dyadic independence) and when the multigraphs are quite small (Walker, 1995). In order for probabilities to be computed, one must be able to calculate $\kappa$, which is just too difficult for most networks. Hence, alternative model formulations and approximate estimation techniques are important. One such alternative, which we now describe, utilizes log-odds ratios of the conditional probabilities of each element of X.
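To see why the normalizing quantity is the obstacle, note that, assuming directed relations without self-ties, the sample space contains $2^{rg(g-1)}$ arrays. The Python sketch below enumerates that space for a deliberately tiny case (g = 3, r = 2, so 4096 arrays) and a model with two choice statistics and one multiplexity statistic. Because only multiplex dependencies are assumed, the result can be checked against a closed form that factorizes over ordered pairs. The function names and parameter values are hypothetical.

```python
from itertools import product
from math import exp, isclose

def variables(g, r):
    """All tie variables (i, j, m) with i != j."""
    return [(i, j, m) for i in range(g) for j in range(g) if i != j
            for m in range(r)]

def z(x):
    """Statistics (f_X1, f_X2, f_{X1 and X2}) for a two-relation array x."""
    f1 = sum(v for (i, j, m), v in x.items() if m == 0)
    f2 = sum(v for (i, j, m), v in x.items() if m == 1)
    f12 = sum(x[(i, j, 0)] * x[(i, j, 1)] for (i, j, m) in x if m == 0)
    return (f1, f2, f12)

def kappa(theta, g, r=2):
    """Normalizing quantity by brute-force enumeration of the sample space."""
    names = variables(g, r)
    total = 0.0
    for bits in product((0, 1), repeat=len(names)):
        x = dict(zip(names, bits))
        total += exp(sum(t * s for t, s in zip(theta, z(x))))
    return total

theta = (-1.0, -0.5, 0.8)   # made-up parameter values
g = 3
k = kappa(theta, g)

# With multiplex dependencies only, each ordered pair contributes one factor.
closed = (1 + exp(theta[0]) + exp(theta[1]) + exp(sum(theta))) ** (g * (g - 1))
print(isclose(k, closed, rel_tol=1e-9))
```

For the 29-actor, two-relation example analysed later, the same enumeration would involve $2^{1624}$ arrays, which is why it is not attempted for models without such a factorization.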

3.2.3. The logit model

We can turn model (2) into a generalized autologistic model for conditional probabilities, giving us an equivalence between model (2) and spatial models (Besag, 1972, 1974; Strauss, 1992). The step utilizes the dichotomous nature of the random variable $X_{ijm}$ and produces an approximate likelihood function that is much easier to deal with.

We first condition on the complement of $X_{ijm}$ and consider just the probability that the dichotomous random variable $X_{ijm}$ is unity. Recall that this variable records whether the tie from i to j of type m is present. Specifically, writing $x^+_{ijm}$ and $x^-_{ijm}$ for the array x with $x_{ijm}$ set to 1 and to 0, respectively, consider

$$
P(X_{ijm} = 1 \mid X^c_{ijm}) = \frac{P(X = x^+_{ijm})}{P(X = x^+_{ijm}) + P(X = x^-_{ijm})}
= \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^+_{ijm})\} + \exp\{\theta' z(x^-_{ijm})\}}, \tag{3}
$$

which has the advantage of not depending on the normalizing quantity. We next consider the odds ratio, which simplifies model (3):

$$
\frac{P(X_{ijm} = 1 \mid X^c_{ijm})}{P(X_{ijm} = 0 \mid X^c_{ijm})}
= \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^-_{ijm})\}}
= \exp\{\theta' [z(x^+_{ijm}) - z(x^-_{ijm})]\}. \tag{4}
$$

From this, the log-odds ratio, or logit model, has the rather simple expression

$$
\omega_{ijm} = \log\left\{\frac{P(X_{ijm} = 1 \mid X^c_{ijm})}{P(X_{ijm} = 0 \mid X^c_{ijm})}\right\}
= \theta' [z(x^+_{ijm}) - z(x^-_{ijm})]. \tag{5}
$$

If we define $\delta(x_{ijm}) = [z(x^+_{ijm}) - z(x^-_{ijm})]$, then the logit model (5) simplifies succinctly to $\omega_{ijm} = \theta' \delta(x_{ijm})$. The expression $\delta(x_{ijm})$ is the vector of changes in the network statistics that arises when the quantity $x_{ijm}$ changes from 1 to 0. This version of the model is a logit p* model for a multivariate network, and is a generalized autologistic model (see Strauss, 1992) applied to social network data.

3.3. Estimation

The likelihood function for the general form of multivariate p* model (2) is

$$L(\theta) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)},$$

where the dependence on the normalizing quantity can easily be seen. As mentioned, maximum likelihood estimation of $\theta$ is difficult due to the size of the sample space.

An approximate estimation approach, proposed by Besag (1975, 1977b) and adopted by Strauss (1986), Strauss & Ikeda (1990) and Wasserman & Pattison (1996), utilizes tools made popular in models for rectangular lattices and spatial data. Specifically, we use the logit formulation and define the pseudo-likelihood function as

$$
PL(\theta) = \prod_{i \neq j} \prod_{m=1}^{r} P(X_{ijm} = 1 \mid X^c_{ijm})^{x_{ijm}} \, P(X_{ijm} = 0 \mid X^c_{ijm})^{1 - x_{ijm}} \tag{6}
$$

and a maximum pseudo-likelihood estimator (MPLE) to be the value of $\theta$ that maximizes (6). MPLEs are much easier to calculate than maximum likelihood estimators (MLEs). MPLEs differ from MLEs for all but the simplest models (those for which the conditional probabilities are indeed independent of the complement relation). Basically, the approach assumes conditional independence of the random variables representing the multivariate relational ties (for discussion of the issues in using maximum pseudo-likelihood rather than maximum likelihood estimation, see Wasserman & Pattison, 1996, and Preisler, 1993).

There is a large literature on the use of approximate likelihoods in spatial modelling. Diggle (1996) reviews models for discrete spatial variation and notes that there are several possible estimation techniques. He notes in his detailed discussion that MPLEs are more efficient than other possibilities (which include the coding method of Besag, 1974). Further, for moderately large samples, the differences between MPLEs and MLEs are often negligible. Small sample sizes, and hence small networks (g < 10), unfortunately, are particularly problematic.


In social network modelling, Strauss & Ikeda (1990) established that estimation of $\theta$ for single, dichotomous relations can be accomplished via logistic regression, using any standard logistic regression model-fitting routine. In particular, they showed that maximizing the pseudo-likelihood given in equation (6) is equivalent to maximizing the likelihood function for the fit of logistic regression to model (5) (for independent observations $x_{ijm}$). Further, they observed that such logistic regressions can be fitted using iteratively reweighted Gauss–Newton computational techniques, as implemented by any logistic regression model package.

The proof of this result uses the fact that the derivatives of the pseudo-likelihood set equal to zero are identical to those obtained from a logistic regression with the relational variables as data values. Thus, fitting p* can be done by using the logit p* form and assuming that the relational variables are actually statistically independent. The idea for this theorem was first suggested by Frank & Strauss (1986) for estimation of the parameters in their triad model. The generalization of this result to the three-way binary array X is straightforward.
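A minimal Python sketch of this equivalence follows. It builds one row of change statistics per tie variable and fits a logistic regression without a separate intercept by Newton-Raphson iteration, used here as a stand-in for the iteratively reweighted routine of a statistical package. The model (two choice parameters and one exchange parameter), the two 4-actor relations and all function names are hypothetical.

```python
from math import exp

def z(x):
    """Statistics (f_X1, f_X2, f_{X1 and X2'}) for an array x[m][i][j]."""
    g = len(x[0])
    pairs = [(i, j) for i in range(g) for j in range(g) if i != j]
    return [sum(x[0][i][j] for i, j in pairs),
            sum(x[1][i][j] for i, j in pairs),
            sum(x[0][i][j] * x[1][j][i] for i, j in pairs)]

def change_stats(x, i, j, m):
    """z(x+) - z(x-) for the tie of type m from i to j (x is restored)."""
    old = x[m][i][j]
    x[m][i][j] = 1
    z_plus = z(x)
    x[m][i][j] = 0
    z_minus = z(x)
    x[m][i][j] = old
    return [a - b for a, b in zip(z_plus, z_minus)]

def solve(a, b):
    """Solve the small linear system a s = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            f = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= f * m[c][k]
    s = [0.0] * n
    for r in reversed(range(n)):
        s[r] = (m[r][n] - sum(m[r][k] * s[k] for k in range(r + 1, n))) / m[r][r]
    return s

def fit_logistic(rows, y, iterations=25):
    """Newton-Raphson for logit P(y = 1) = theta' d, with no extra intercept."""
    p = len(rows[0])
    theta = [0.0] * p
    for _ in range(iterations):
        mu = [1 / (1 + exp(-sum(t * v for t, v in zip(theta, d)))) for d in rows]
        grad = [sum((yi - mi) * d[a] for yi, mi, d in zip(y, mu, rows))
                for a in range(p)]
        hess = [[sum(mi * (1 - mi) * d[a] * d[b] for mi, d in zip(mu, rows))
                 for b in range(p)] for a in range(p)]
        theta = [t + s for t, s in zip(theta, solve(hess, grad))]
    return theta

# Made-up two-relation network on four actors.
X1 = [[0, 1, 1, 0], [1, 0, 0, 0], [0, 1, 0, 1], [0, 0, 0, 0]]
X2 = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 0, 0, 0], [1, 0, 0, 0]]
x, g = [X1, X2], 4

rows, y = [], []
for m in range(2):
    for i in range(g):
        for j in range(g):
            if i != j:
                rows.append(change_stats(x, i, j, m))   # explanatory variables
                y.append(x[m][i][j])                    # response variable
theta_hat = fit_logistic(rows, y)
print(theta_hat)
```

The sketch has no safeguards against separation or non-convergence, which a standard logistic regression routine would normally report.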

The evaluation of the fit of multivariate p* is not straightforward, but it is helpful to compare the observed values $x_{ijm}$ with the fitted values $\hat{x}_{ijm}$. The fitted values, as is common with dichotomous variables, are defined as $\hat{x}_{ijm} = \hat{P}(X_{ijm} = 1 \mid X^c_{ijm})$. The estimated conditional probabilities are computed from

$$\operatorname{logit} \hat{P}(X_{ijm} = 1 \mid X^c_{ijm}) = \hat{\theta}' \delta(x_{ijm}).$$

Two useful indices of fit are the pseudo-likelihood ratio statistic

$$G^2_{PL} = 2 \sum x_{ijm} \log(x_{ijm} / \hat{x}_{ijm})$$

for a model, and the mean of the absolute value of the residuals $(x_{ijm} - \hat{x}_{ijm})$. In the examples below, we report both $G^2_{PL}$ and the mean absolute residual. Unfortunately, as with all other uses of this MPLE approach, the distribution of $G^2_{PL}$ is unknown, even asymptotically, and there is no straightforward way of estimating the standard errors of parameter estimates (although asymptotic standard errors calculated from logistic regression models can give approximate guidance to the modeller). Crouch & Wasserman (1998) give some preliminary results comparing MPLEs to MLEs and report the optimistic finding that, for moderately large networks (g > 10), both standard errors and test statistics based on the pseudo-likelihood approach are quite close to those based on the exact likelihood.
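Both indices are simple to compute once the fitted conditional probabilities are available. The Python sketch below rests on one assumption that should be stated plainly: the sum in $G^2_{PL}$ is read as running over both outcomes of each tie variable (tie present and tie absent), with $0 \log 0 = 0$, which gives the usual logistic regression deviance. The function name and the observed and fitted values are made up.

```python
from math import log

def fit_indices(observed, fitted):
    """Return (G2_PL, mean absolute residual) for 0/1 observations.

    Assumes G2_PL is the logistic-regression deviance, that is, the sum
    over both outcomes of each tie variable (a reading of the formula,
    not a statement of how the published values were obtained).
    """
    g2 = 0.0
    for x, p in zip(observed, fitted):
        g2 += -2 * (log(p) if x == 1 else log(1 - p))
    mean_abs = sum(abs(x - p) for x, p in zip(observed, fitted)) / len(observed)
    return g2, mean_abs

observed = [1, 0, 0, 1]           # made-up tie indicators
fitted = [0.8, 0.3, 0.1, 0.6]     # made-up fitted conditional probabilities
g2, mean_abs = fit_indices(observed, fitted)
print(g2, mean_abs)
```

Differences in $G^2_{PL}$ between nested models can then be formed by subtraction, bearing in mind the caveat in the text that its distribution is unknown.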

3.4. Computational details

Maximum pseudo-likelihood estimates of the parameters of model (1) are obtained by fitting the logistic regression model (5). In order to fit model (5), we compute, for each relational tie, the values of the ‘explanatory variables’ $z(x^+_{ijm}) - z(x^-_{ijm})$ corresponding to each statistic z(x); we then use these as the observed explanatory variables for the realization of $X_{ijm}$ (the ‘response variable’) in the logistic regression corresponding to model (5).

The computation of the values $z(x^+_{ijm}) - z(x^-_{ijm})$ is simple, but it is useful to note that the values may take a different form for the various types of relational ties (corresponding to the subscript m of $X_{ijm}$). For example, suppose that there are two relations, $X_l$ and $X_h$, respectively, and consider the parameter corresponding to the triadic effect $Z = (X_l X_h) \cap X_h$. If we assume homogeneity, then the sufficient statistic for this parameter is $f_Z$. For the two relations, the computed values of the explanatory variable for this triadic effect are equal to the changes in the statistic $f_Z$ when $x_{ijm}$ changes from 1 to 0, for m = l or h. Thus, when m = l (corresponding to the values for the first relation $X_l$), we compute $\sum_k x_{ikh} x_{jkh}$ as the value of the explanatory variable corresponding to this parameter, and when m = h (corresponding to an $X_h$ tie), we compute $\sum_k x_{ikl} x_{kjh} + \sum_k x_{kil} x_{kjh}$.
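The two closed-form expressions can be checked numerically against direct differencing of the statistic. The Python sketch below takes $f_Z = \sum_{i,j,k} x_{ikl} x_{kjh} x_{ijh}$ as the count form of $Z = (X_l X_h) \cap X_h$ and assumes relations with zero diagonals; the randomly generated relations and the function names are hypothetical.

```python
import random

def f_Z(xl, xh):
    """Count-form statistic for Z = (X_l X_h) intersected with X_h."""
    g = len(xl)
    return sum(xl[i][k] * xh[k][j] * xh[i][j]
               for i in range(g) for j in range(g) for k in range(g))

def toggled_change(xl, xh, i, j, which):
    """f_Z with the (i, j) tie of relation `which` set to 1, minus set to 0."""
    target = xl if which == "l" else xh
    old = target[i][j]
    target[i][j] = 1
    plus = f_Z(xl, xh)
    target[i][j] = 0
    minus = f_Z(xl, xh)
    target[i][j] = old          # restore the original entry
    return plus - minus

def closed_form_change(xl, xh, i, j, which):
    """The expressions given in the text for m = l and m = h."""
    g = len(xl)
    if which == "l":
        return sum(xh[i][k] * xh[j][k] for k in range(g))
    return (sum(xl[i][k] * xh[k][j] for k in range(g))
            + sum(xl[k][i] * xh[k][j] for k in range(g)))

def random_relation(g, rng):
    """Random 0/1 relation with a zero diagonal (no self-ties)."""
    return [[0 if i == j else rng.randint(0, 1) for j in range(g)]
            for i in range(g)]

rng = random.Random(0)
g = 5
xl, xh = random_relation(g, rng), random_relation(g, rng)
mismatches = [(i, j, which)
              for which in "lh" for i in range(g) for j in range(g)
              if i != j and toggled_change(xl, xh, i, j, which)
              != closed_form_change(xl, xh, i, j, which)]
print(mismatches)   # an empty list means the two computations agree
```

The agreement relies on the zero diagonal: terms in which the toggled tie would appear twice involve a self-tie and therefore vanish.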

4. Examples

We illustrate the construction and fitting of multivariate p* models using two examples.

4.1. The Grade 7 peer network

The first example is an extension of the data analysed by Wasserman & Pattison (1996). Vickers (1981) and Vickers & Chan (1981) obtained network data from 29 students in grade 7 in a school in Victoria, Australia. They asked students to nominate their classmates on a number of relations, including the following:

1) Who are your best friends in the class?
2) Who would you rather not have as a friend?

We label the relations defined by these two questions as $X_B$ (relation 1) and $X_N$ (relation 2), and their associated matrices as B and N, respectively. The matrix for the ‘best friends’ relation is given here as our Table 2, and the matrix for the ‘not friends’ relation as our Table 3. As noted by Wasserman & Pattison (1996), actors 1–12 are boys, while actors 13–29 are girls.

In Wasserman & Pattison (1996), we analysed the relation $X_B$ and established that it possessed strong reciprocity and transitivity effects. Here, we fit models simultaneously to the relations $X_B$ and $X_N$ in an attempt to model their mutual interdependence. Our models use the methodology described earlier and are guided by the literature that has speculated on the structure of positive and negative affect ties (see the discussion in Wasserman & Faust, 1994, Chapter 6, on signed graphs); we also compare our models to previous descriptive analyses of similar types of ties. We report the fit of a number of homogeneous models.

Models 1a and 1b – independence. We first fit two versions of a complete independence model, in which we make the (implausible) assumption that all observed ties are independent. In the first version of the model, we allow a single separate ‘choice’ parameter $\theta_Z$ (where Z may be either B or N) for each type of relation; in the second, more restricted, version, we assume a single common choice parameter. In both versions of the model, the maximal cliques of the dependence graph have the form $\{(i,j,m)\}$; in model 1a, the parameters corresponding to this clique are assumed to depend on relation m (but not on actor i or j), whereas in model 1b, the parameter is assumed constant for all i, j and m. The sufficient statistics for model 1a are $f_B$ and $f_N$; model 1b has sufficient statistic $f_{B+N}$. The fit of models 1a and 1b is summarized in Table 4. Neither model provides a good fit, with the mean of the absolute residuals equal to approximately 0.37. Since model 1b is nested in model 1a, the difference between the pseudo-likelihood ratio statistics is of interest, and we note that model 1b appears to be no worse a fit than model 1a ($\Delta G^2_{PL} = 3.2$, and the models differ by one parameter).

Model 2 – multiplexity. Model 2 is a multiplexity model with maximal cliques $\{(i,j,1), (i,j,2)\}$. The model allows for the possibility that an $X_B$ tie from i to j is conditionally dependent on an $X_N$ tie from i to j. The parameters of the model have the form $\theta_Z$, where Z may be B, N or $B \cap N$; the corresponding sufficient statistics are $f_B$, $f_N$ and $f_{B \cap N}$, respectively. Thus, this model adds a single multiplex parameter $\theta_{B \cap N}$ to the two choice parameters in model 1a. Model 2 appears to be a substantial improvement over model 1a ($\Delta G^2_{PL} = 253.7$, with one additional parameter), but the small frequency of $B \cap N$ ties implies that the MPLE of its corresponding parameter is likely to have a large standard error.

Models 3a and 3b – reciprocity and exchange. Model 3 assumes bivariate dyad independence (as described by Wasserman, 1987) and has maximal cliques $\{(i,j,1), (i,j,2), (j,i,1), (j,i,2)\}$. We fit two restricted versions of the model: first, model 3a, in which only choice and reciprocity effects are assumed (with parameters $\theta_Z$ for Z = B, N, $B \cap B'$ and $N \cap N'$); and second, model 3b, with an additional exchange parameter $\theta_Z$ for the relation $Z = B \cap N'$. In model 3a, the presence of an $X_B$ tie from i to j is assumed to be conditionally dependent on the presence of an $X_B$ tie from j to i (that is, on the presence of reciprocity); similarly for $X_N$ ties. Model 3b allows, in addition, the presence of an $X_B$ tie from i to j to be conditionally dependent on the presence of an $X_N$ tie from j to i (that is, on the exchange of an $X_N$ tie for an $X_B$ one). We have not fitted the most general homogeneous dyad-independence model, which includes multiplexity parameters, since B and N co-occur only rarely (and, as a result, it is difficult to fit parameters corresponding to relations such as $B \cap N$, $B \cap N \cap B'$, and so forth). The fit statistics in Table 4 indicate that not only is model 3a a substantial improvement over model 1a ($\Delta G^2_{PL} = 208.6$, with just two additional parameters), but also that model 3b provides a marginally better fit than model 3a ($\Delta G^2_{PL} = 27.1$, with one additional parameter). These figures suggest the presence of both reciprocity and exchange effects. Note, though, that the fit of model 3b is still not particularly good, with the mean of the absolute residuals equal to 0.311.

Table 2. Vickers & Chan’s (1981) network data: ‘best friends’ relation

0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 1 0 1 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 1 0 1 0 1 1 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 0 1 0
1 1 1 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 0 0 1 1 1
1 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 0 0 1 1 1 1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 1 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 1 0 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 1 0 0 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0

Table 3. Vickers & Chan’s (1981) network data: ‘not friends’ relation

0 0 0 0 0 0 1 0 1 1 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 0 0 1 1
0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 1 1 0 0 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0
1 0 1 1 0 0 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 1
0 0 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
0 0 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0
1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 1
1 0 0 1 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0

Table 4. Summary of fit of models 1a–5b to the grade 7 peer network

Model   No. of parameters   $G^2_{PL}$   Mean absolute residual
1a       2                  1794.1       0.366
1b       1                  1797.3       0.367
2        3                  1540.4       0.329
3a       4                  1585.5       0.315
3b       5                  1558.4       0.311
4       13                  1511.0       0.300
5a      19                  1220.6       0.241
5b      23                  1032.3       0.196

Model 4 – path dependence. Model 4 is a path-dependent model and assumes that a tie of any type from i to j may be conditionally dependent on ties of any type from j to some third individual k. Maximal cliques therefore have the form $\{(i,j,m), (j,i,h)\}$ or $\{(i,j,m), (j,k,h), (k,i,p)\}$; parameters and sufficient statistics are given by $\theta_Z$ and $f_Z$, respectively, where Z may be any of the relations B, N, $B \cap B'$, $N \cap N'$, $B \cap N'$, BB, BN, NB, NN, $BB \cap B'$, $BN \cap B'$, $BN \cap N'$ and $NN \cap N'$. Compared to model 3b, model 4 adds only marginally to the fit ($\Delta G^2_{PL} = 47.4$, with eight additional parameters).

Models 5a and 5b – restricted Markov random graph models. The final set of models are path-dependent models with additional dependencies assumed on substantive grounds. All models have the model 4 parameters; in addition, model 5a possesses dependencies consistent with the transitivity-like hypothesis that friends are likely to agree on their relations with third parties (hence likely pairwise conditional dependencies between relational ties of type $X_B$ from i to j, i to k and j to k, and also between relational ties of type $X_B$ from i to j, of type $X_N$ from i to k and of type $X_N$ from j to k). Model 5b possesses additional dependencies consistent with the claim that non-friends are likely to disagree on their relations with third parties (hence likely pairwise conditional dependencies between relational ties of type $X_N$ from i to j, of type $X_N$ from i to k and of type $X_B$ from j to k, and also between relational ties of type $X_N$ from i to j, of type $X_B$ from i to k and of type $X_N$ from j to k). (See Johnsen (1986) for a review and analysis of the literature on the structure of affective ties, and Pattison (1993) for an algebraic translation of these structural claims.) Model 5a adds $\{(i,j,1), (j,k,1), (i,k,1)\}$ and $\{(i,j,1), (j,k,2), (i,k,2)\}$ to the set of maximal cliques for model 4; model 5b also adds $\{(i,j,2), (j,k,1), (i,k,2)\}$ and $\{(i,j,2), (j,k,2), (i,k,1)\}$. We note that all of the subcliques of these additional maximal cliques have corresponding parameters in models 5a and 5b; these additional subcliques correspond to various forms of stars: $\{(i,j,m), (i,k,h)\}$, $\{(i,j,m), (k,j,h)\}$ and $\{(i,j,m), (j,k,h)\}$. As indicated in Table 4, the additional dependencies assumed by model 5a lead to a substantial improvement over the simple path-dependent model 4 ($\Delta G^2_{PL} = 290.4$, with six additional parameters), and those associated with model 5b lead to a modest further improvement in fit ($\Delta G^2_{PL} = 188.3$, with four additional parameters). The mean of the absolute residuals for model 5b is 0.196, suggesting a more reasonable fit to the data (but one that could lend itself to further possible improvement).

The MPLEs for the parameters of model 5b are displayed in Table 5. Positive estimates were observed for both reciprocity parameters and for the parameters associated with three of the four additional hypothesized dependencies. Thus, the conditional odds of a tie of any type appear to be enhanced if a reciprocal tie of the same type is present, if the tie completes one of the expected triadic structures for agreement between friends, or if the tie completes a triad in which an individual would rather not have as a friend any friend of someone who has been indicated as a non-friend. Negative estimates were obtained for the exchange parameter, for 2-stars comprising two incoming $X_B$ ties, and for 3-cycles comprising $X_B$ ties. Thus, the conditional odds of a tie of any type appear to be reduced by the presence of a reciprocated tie of the other type; in addition, the odds of an $X_B$ tie being directed to a particular individual are

Philippa Pattison and Stanley Wasserman184

reduced if other $X_B$ ties are also directed to the same individual, or if the tie completes a 3-cycle of $X_B$ ties.
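The interpretation above in terms of conditional odds can be made concrete with a small calculation. The sketch below assumes the usual logit reading of p* parameters (a unit increase in a statistic multiplies the conditional odds of the tie by the exponential of the estimate) and uses a few estimates read from Table 5, with decimal placement assumed; the dictionary keys are hypothetical labels chosen for illustration.

```python
import math

# Selected model 5b estimates as read from Table 5 (decimal placement assumed).
estimates = {
    "reciprocity_B": 2.53,   # B and its converse
    "reciprocity_N": 0.61,   # N and its converse
    "exchange_BN": -0.67,    # B with the converse of N
    "cycle3_B": -0.72,       # 3-cycle of B ties
}

def odds_multiplier(estimate, change=1):
    """Factor applied to the conditional odds when the statistic changes by `change`."""
    return math.exp(estimate * change)

for name, value in estimates.items():
    print(f"{name}: conditional odds multiplied by {odds_multiplier(value):.2f}")
```

Positive estimates give multipliers above 1 (enhanced odds) and negative estimates give multipliers below 1 (reduced odds), which is the reading used in the paragraph above.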

4.2. Padgett & Ansell's Florentine network

Our second example is an analysis of marriage and business ties among groups of Florentine families (Padgett & Ansell, 1993). In an analysis of the rise to power of the Medici family in Florence in the early fifteenth century, Padgett & Ansell constructed a number of network relations among 33 groups of elite families, including marriage and business or economic ties. The construction was based on a coding of various types of network relations among a 92-family ruling elite from Kent's (1978) description of the network foundations of the Medici party and their opponents. Padgett & Ansell used marriage and economic networks to derive a clustering of the 92 families into 33 family groups (using the CONCOR algorithm; see Breiger, Boorman & Arabie, 1975); they then coded a relation of a particular type between two family groups if there were at least two pairs of families, with one family from each group, linked by a relation of that type. The analysis presented below is for marriage and economic relations among these 33 family groups, shown in Figure 2a of Padgett & Ansell (1993); for the purpose of the analysis reported below, within-group relationships are ignored and the various types of economic ties are aggregated into a single business

Logit models and logistic regressions for social networks II 185

Table 5. Parameter estimates for model 5b fitted to the grade 7 peer network

Model parameter                 Z        Estimate   Approximate standard error

1-paths (choice)                B        −1.81      0.76
                                N        −2.39      0.65

2-cycles (reciprocity &         B ∩ B′    2.53      0.37
exchange)                       N ∩ N′    0.61      0.26
                                B ∩ N′   −0.67      0.28

2-paths                         BB        0.01      0.05
                                BN       −0.03      0.04
                                NB       −0.11      0.04
                                NN        0.02      0.04

3-cycles                        BB ∩ B′  −0.72      0.14
                                BN ∩ B′   0.05      0.08
                                BN ∩ N′   0.03      0.07
                                NN ∩ N′  −0.05      0.09

2-stars                         BB′      −0.36      0.08
                                BN′      −0.08      0.04
                                NN′       0.06      0.04
                                B′B      −0.01      0.04
                                B′N      −0.04      0.03
                                N′N       0.07      0.02

Additional hypothesized         BB ∩ B    0.57      0.06
constraints                     BN ∩ N    0.17      0.05
                                NB ∩ N    0.33      0.05
                                NN ∩ B   −0.09      0.06

economic relation. Thus, a marriage tie is coded from one group to another if a woman of the first group is married to a man in the second; a business/economic tie signifies the presence of trading or partnership relationships, the sharing or renting of real estate, or a bank employment relation (see Padgett & Ansell, 1993, pp. 1265–1266).
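The group-level coding rule described above (a tie between two family groups requires at least two linked pairs of families, and within-group relationships are ignored) can be sketched as follows. This is a minimal illustration rather than Padgett & Ansell's procedure: the function name `group_ties`, the family labels and the group assignment are all hypothetical, and ties are treated as directed.

```python
from collections import Counter

def group_ties(family_ties, group_of, threshold=2):
    """Code a directed tie between two family groups when at least `threshold`
    distinct family pairs, one family from each group, are linked."""
    counts = Counter()
    for a, b in set(family_ties):            # each family pair counted once
        ga, gb = group_of[a], group_of[b]
        if ga != gb:                         # within-group relationships are ignored
            counts[(ga, gb)] += 1
    return {pair for pair, n in counts.items() if n >= threshold}

# Invented toy data: five families assigned to three groups.
group_of = {"f1": 1, "f2": 1, "f3": 2, "f4": 2, "f5": 3}
marriage = [("f1", "f3"), ("f2", "f4"), ("f1", "f5"), ("f1", "f2")]
print(group_ties(marriage, group_of))
```

In the toy data only groups 1 and 2 are joined by two family pairs, so only that group-level tie is coded; the single link from group 1 to group 3 falls below the threshold.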

Padgett & Ansell used the interconnections among social and demographic factors, these relational ties, and actions on the part of Cosimo de' Medici to explain the source of the latter's extraordinary power; here we examine the joint network structure of the marriage and business/economic ties.

We label the relations studied by Padgett & Ansell as $X_B$ (business ties) and $X_M$ (marriage ties). Their associated matrices are B and M, respectively.

In Table 6, we report the fit of six classes of models, similar in construction to those reported for the grade 7 peer network. As for the grade 7 peer network, models 1a and 1b are two- and one-parameter complete independence models, respectively, and model 2 is a multiplexity model. It is clear from Table 6 that there is little improvement in fit of the two-parameter choice complete independence model (model 1a) over the one-parameter choice model (model 1b) ($\Delta G^2_{PL} = 0.7$ with one extra parameter); in addition, permitting dependencies among marriage and business ties for the same individuals does little to improve model fit ($\Delta G^2_{PL} = 0.4$ for model 2 compared to model 1a). Models 3a and 3b are reciprocity and exchange models. Model 3a adds to model 1a the reciprocity effects for $X_B$ and $X_M$ ties; model 3b further adds the exchange effect that allows conditional dependence of a marriage tie from i to j and a business tie from j to i. The reciprocity effects in model 3a lead to a substantial improvement in fit over model 1a ($\Delta G^2_{PL} = 164.0$ with two additional parameters), but no further improvement is achieved by permitting the dyadic exchange of marriage and business ties ($\Delta G^2_{PL} = 0.2$). Model 4 is a path-dependent model and is a marginal improvement in fit over model 3b ($\Delta G^2_{PL} = 45.1$ with six additional parameters). Parameters corresponding to cycles with two or more business ties were excluded from the model because of the infrequency of occurrence of such structures.
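The model comparisons quoted above are differences in the pseudo-likelihood deviance between nested models. The short sketch below recomputes them from the Table 6 entries for models 1a to 4; the values are transcribed with one decimal place assumed, and the helper name `improvement` is hypothetical.

```python
# (number of parameters, pseudo-likelihood deviance) transcribed from Table 6;
# decimal placement is an assumption of this sketch.
table6 = {
    "1b": (1, 487.9), "1a": (2, 487.2), "2": (3, 486.8),
    "3a": (4, 323.2), "3b": (5, 323.0), "4": (11, 277.9),
}

def improvement(smaller, larger, table=table6):
    """Return (drop in deviance, extra parameters) when moving from `smaller` to `larger`."""
    p0, g0 = table[smaller]
    p1, g1 = table[larger]
    return round(g0 - g1, 1), p1 - p0

for pair in [("1b", "1a"), ("1a", "2"), ("1a", "3a"), ("3a", "3b"), ("3b", "4")]:
    print(pair, improvement(*pair))
```

Because these are pseudo-likelihood rather than likelihood deviances, the differences are best read as approximate guides to improvement in fit, not as exact chi-squared statistics.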

Since, as Padgett & Ansell (1993) note, the gaining of hierarchical status was the primary consideration in the arrangement of marriage ties between elite families, we might expect marriage ties to exhibit a tendency towards transitivity. Hence, model 5a assumes, in addition


Table 6. Summary of fit of models 1a–6d to the Florentine network

Model   No. of parameters   $G^2_{PL}$   Mean absolute residual

1a       2                  487.2        0.048
1b       1                  487.9        0.048
2        3                  486.8        0.048
3a       4                  323.2        0.032
3b       5                  323.0        0.032
4       11                  277.9        0.029
5a      18                  243.7        0.026
5b      17                  246.3        0.026
6a      21                  227.9        0.026
6b      23                  226.7        0.026
6c      23                  225.2        0.026
6d      23                  217.0        0.025

to conditional dependencies for paths of length 2, pairwise conditional dependencies among marriage ties from i to j, j to k and i to k (and hence adds a parameter corresponding to the relation X = MM ∩ M). Further, all possible stars comprising two relations are added as well, in order to investigate possible interdependencies between marriage and business ties that are not evident at the level of ties from an actor i to an actor j (see the comparison between the complete independence model 1a and the multiplex model 2). These dependencies also require various star parameters, for Z equal to MM′, M′M, M′B and BB′.
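The relations named above are built from compositions, converses and intersections of the matrices M and B. As a rough illustration of how the corresponding configurations can be counted, the sketch below uses plain nested lists on an invented three-group example; the helper names are hypothetical and the counts are over ordered pairs (i, j) with i ≠ j.

```python
def compose(A, B):
    """Entry (i, j) counts the k with A[i][k] = 1 and B[k][j] = 1."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def converse(A):
    """Converse relation A': reverse the direction of every tie."""
    return [list(row) for row in zip(*A)]

def count_offdiag(C):
    n = len(C)
    return sum(C[i][j] for i in range(n) for j in range(n) if i != j)

# Invented toy matrices: marriage ties 0->1, 0->2, 1->2; business ties 1->0, 2->1.
M = [[0, 1, 1], [0, 0, 1], [0, 0, 0]]
B = [[0, 0, 0], [1, 0, 0], [0, 1, 0]]

in_stars_MM = count_offdiag(compose(M, converse(M)))   # MM': i and j both send M to some k
mixed_stars = count_offdiag(compose(converse(M), B))   # M'B: some k sends M to i and B to j
MM = compose(M, M)
transitive_M = sum(MM[i][j] * M[i][j] for i in range(3) for j in range(3) if i != j)  # MM ∩ M
print(in_stars_MM, mixed_stars, transitive_M)
```

Statistics of this kind are the explanatory variables whose parameters are reported in the tables; the exact counting conventions (ordered versus unordered pairs) are an assumption here.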

The fit of model 5a was a modest improvement over that of model 4 ($\Delta G^2_{PL} = 34.2$ with six additional parameters). The estimated parameter corresponding to the relation MM ∩ M is not large, so in model 5b the parameter is removed, with little effect on the fit of the model ($\Delta G^2_{PL} = 2.6$).

A final set of models fitted to the data investigated the possibility of structural differences in ties according to party affiliation. As Padgett & Ansell (1993) observed, the first 10 family groups are substantially identified with the Medici party (the Medici family themselves comprising group 1), whereas the remaining groups of families are not. Padgett & Ansell described the remarkable structural differences between the network of relations within the Medici party and within the remaining (largely oligarchic) set. Models 6a–6d therefore allow various model 5b parameters to differ according to whether they refer to ties lying within the collection of Medici blocks, to ties connecting non-Medici blocks, or to ties crossing the boundary between the two collections of blocks. Model 6a allows such variation for the density parameter and is a substantial improvement over model 5b ($\Delta G^2_{PL} = 18.4$ with four additional parameters). Model 6b permits the parameters for 'mixed' out-stars comprising marriage and business ties to differ for the three types of blocks and is not a substantial improvement over model 6a ($\Delta G^2_{PL} = 1.4$). Model 6c allows heterogeneity across blocks in the parameters for 2-paths comprising marriage and business ties; it also fails to improve fit compared to model 6a ($\Delta G^2_{PL} = 2.5$). The final model, 6d, permits heterogeneity across blocks in the parameters for paths comprising two marriage ties; in this case, there is a modest improvement in fit compared to model 6a ($\Delta G^2_{PL} = 10.8$ with two additional parameters).

The estimated parameters for model 6d are shown in Table 7. The estimates suggest a strong tendency for reciprocated business ties, a tendency that is unsurprising given the form of business or economic ties, such as partnerships. There are weaker tendencies for the existence of 2-paths comprising either marriage or business ties; marriage ties also appear to be more likely if they complete a cycle of three marriage ties. Padgett & Ansell (1993) noted the presence of these cycles and analysed both their development and their consequences; they make a compelling argument for their importance to the evolving structure of the oligarchy. It can also be seen from Table 7 that path structures in which an outgoing marriage tie is accompanied by an incoming business tie reduce the likelihood of the overall structure. Estimates of star parameters suggest the prevalence of heterogeneous stars in which a group of families have marriage ties with one group and business ties with another. The parameter estimates for homogeneous marriage in-stars and out-stars are both negative: there appears to have been a reduced conditional probability of a marriage tie to a family group if some other group also had such a tie and, to a lesser extent, if the first family group had another outgoing marriage tie.

The parameters for block-dependent densities suggest an enhanced likelihood of marriage ties within the Medici collection of family groups and, to a lesser extent, within the non-Medici collection; marriage ties between the two types of family groups were less likely. Business ties exhibit a substantially weaker pattern of the same form. Together, these characteristics of the network reflect what Padgett & Ansell noted was a remarkable interdependence of marriage and economic ties, on the one hand, and political partisanship, on the other, and they support their conclusion that the microstructure of marriage and economics was central to the formation of parties in Florence (1993, p. 1277). The block-dependence of marriage 2-paths takes a different and interesting form: such paths are less likely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection, and they are even more likely to link family groups of different types. The group containing members of the Medici family is the major contributor to this pattern, as they are the only Medici group with marriage connections outside the collection mobilized into the Medici party. Note that this structural effect is fitted at the same time as the cyclic pattern for marriage ties, so that although, as Padgett & Ansell noted, there are many more two-step marriage connections for non-Medici than for Medici partisans, many of the former

Table 7. Parameter estimates for model 6d fitted to the Florentine network

Model parameter                    Z         Estimate   Approximate standard error

1-paths (choice)                   M         −5.17      1.02
                                   B         −7.37      1.25

2-cycles (reciprocity and          M ∩ M′     0.95      0.94
exchange)                          B ∩ B′    10.33      1.72
                                   M ∩ B′     0.65      1.08

2-paths                            MM         0.66      0.32
                                   MB         0.16      0.38
                                   BM        −0.84      0.37
                                   BB         1.26      0.95

3-cycles                           MM ∩ M′    2.12      0.61
                                   MB ∩ M′   −0.35      0.85

2-stars                            MM′       −1.55      0.37
                                   M′M       −0.43      0.20
                                   BB′       −1.53      1.08
                                   B′B       −0.85      0.99
                                   MB′       −0.14      0.36
                                   M′B        0.92      0.35

Subgroup-dependent M effects
  1-paths within Medici                       3.71      1.12
  1-paths between subgroups                  −4.67      1.92
  1-paths within other subgroups              0.96

Subgroup-dependent B effects
  1-paths within Medici                       0.70      1.06
  1-paths between subgroups                  −0.80      0.87
  1-paths within other subgroups              0.10

Subgroup-dependent MM effects
  2-paths within Medici                      −1.33      0.46
  2-paths between subgroups                   1.08      0.44
  2-paths within other subgroups              0.25

connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).

Thus, model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell's detailed account.

5. Conclusion

The multivariate p* model is very general in form and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996), and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number r of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research – much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus, a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus, the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form z(x) of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different 'parts' of the network (and in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that in the literature on algebraic models for multivariate networks there is a second tradition that can usefully guide such statistical developments. Local structural descriptions based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995) describe heterogeneity across individuals. Thus, a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown in general how Markov chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).
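To indicate what the simulation step of such a bootstrap might involve, the following is a minimal Metropolis sketch for a deliberately simple model containing only choice and reciprocity effects for each relation. It is not the procedure of Preisler (1993) or of Crouch & Wasserman (1998); the function name, the parameter values and the restriction to two kinds of effect are assumptions made for illustration.

```python
import math
import random

def simulate_p_star(g, r, theta_choice, theta_recip, steps, seed=0):
    """Metropolis sampler for a toy p*-style model on r relations among g actors.

    Assumed exponent: for each relation m,
    theta_choice[m] * (number of ties) + theta_recip[m] * (number of mutual dyads).
    Returns x with x[m][i][j] in {0, 1} and an empty diagonal.
    """
    rng = random.Random(seed)
    x = [[[0] * g for _ in range(g)] for _ in range(r)]
    for _ in range(steps):
        m = rng.randrange(r)
        i, j = rng.sample(range(g), 2)            # two distinct actors
        # Change in the exponent if tie (i, j, m) were switched from 0 to 1.
        gain = theta_choice[m] + theta_recip[m] * x[m][j][i]
        delta = gain if x[m][i][j] == 0 else -gain
        # Symmetric proposal: accept the toggle with probability min(1, exp(delta)).
        if delta >= 0 or rng.random() < math.exp(delta):
            x[m][i][j] = 1 - x[m][i][j]
    return x

sample = simulate_p_star(g=6, r=2, theta_choice=[-1.5, -1.0],
                         theta_recip=[2.0, 0.5], steps=5000, seed=42)
density = [sum(map(sum, layer)) / (6 * 5) for layer in sample]
print(density)
```

A bootstrap along these lines would draw many such arrays from the fitted model, refit the model to each by maximum pseudo-likelihood, and use the spread of the refitted estimates as a standard error; only the sampling step is shown here.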

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754) and the National Institute of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance, and to Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.
Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.
Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.
Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Association, 47, 77–92.
Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.
Boorman, S. A. & White, H. C. (1976). Social structure from multiple networks: II. Role structures. American Journal of Sociology, 81, 1384–1446.
Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.
Breiger, R. L., Boorman, S. A. & Arabie, P. (1975). An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.
Coleman, J. S., Katz, E. & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.
Contractor, N. & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.
Cox, D. R. & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.
Crouch, B. & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.
Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency and reciprocity. Sociometry, 31, 102–119.
Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds), Advances in biometry. New York: Wiley.
Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.
Doreian, P. (1986). On the evolution of group and network structure: II. Structures within structures. Social Networks, 8, 33–64.
Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.
Fienberg, S. E. & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.
Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.
Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.
Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds), Operations research proceedings 1986, pp. 455–460. Berlin/Heidelberg: Springer-Verlag.
Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.
Frank, O. (1997). Composition and structure of social networks. Mathematiques, Informatique et Science Humaines, 137, 11–23.
Frank, O., Lundquist, S., Wellman, B. & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.
Frank, O. & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds), Quo Vadis, Graph Theory? Annals of Discrete Mathematics, 55, 349–366.
Frank, O. & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.
Galaskiewicz, J. & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.
Geyer, C. J. & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.
Holland, P. W. & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.
Holland, P. W. & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.
Hubert, L. J. & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.
Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.
Iacobucci, D. & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.
Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.
Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.
Katz, L. & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.
Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.
Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.
Lazega, E. & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.
Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.
Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.
Lomi, A. & Pattison, P. (1998). Multivariate p* models of producers' market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.
Lorrain, F. P. & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.
Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.
Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.
Merton, R. K. (1957). Social theory and social structure. New York: Free Press.
Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.
Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.
Padgett, J. F. & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.
Parsons, T. (1966). The structure of social action. New York: Free Press.
Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.
Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.
Pattison, P. & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.
Pattison, P., Wasserman, S., Robins, G. & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.
Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.
Rennolls, K. (1995). p1/2. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.
Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.
Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.
Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.
Strauss, D. & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.
Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.
Vickers, M. & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.
Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.
Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.
Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.
Wasserman, S. & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.
Wasserman, S. & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.
Wasserman, S. & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 60, 401–425.
Wasserman, S. & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.
Wellman, B., Frank, O., Espinoza, V., Lundquist, S. & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.
White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.
White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.
White, H. C., Boorman, S. A. & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.
Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.
Winship, C. & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999



3.1. Theory

We first present a generalization of the basic p* model to multivariate networks. We define a set of random variables based on the relational ties in the network, and then construct a dependence graph for this situation. The Hammersley–Clifford theorem (Besag, 1974) posits a probability distribution for these random variables by using the postulated dependence graph. The exact form of the dependence graph depends on the nature of the substantive hypotheses about the social network under study; we discuss such hypotheses at length.

3.1.1. Probability models for multivariate directed random graphs

Any observed multivariate network may be regarded as a realization $x = [x_{ijm}]$ of a random three-way binary array $X = [X_{ijm}]$. In general, the entries of the array X cannot be assumed to be independent; consequently, it is helpful to specify a dependence structure for the random variables $X_{ijm}$, as originally suggested by Frank & Strauss (1986).

The dependence structure for these random variables is determined by the dependence graph D of the random array X. D is itself a graph whose nodes are elements of the index set $\{(i, j, m) : i, j \in N,\ i \neq j,\ m \in R\}$ for the random variables in X, and whose edges signify pairs of the random variables that are assumed to be conditionally dependent (given the values of all other random variables). More formally, a dependence graph for a multivariate social network has node set

\[ N_D = \{(i, j, m) : i, j \in N,\ i \neq j,\ m \in R\}. \]

The edges of D are given by

\[ E_D = \{((i, j, m), (k, l, h)) : X_{ijm} \text{ and } X_{klh} \text{ are conditionally dependent}\}. \]

The dependence graph is an example of what is termed an independence graph in the graphical modelling literature (for example, Lauritzen, 1996); see Robins (1998) for an extended discussion of the application of graphical modelling techniques to social network models.

As Frank & Strauss (1986) observed for univariate graphs and associated two-way binary arrays, several well-known classes of distributions for random graphs may be specified in terms of the structure of the dependence graph. For example, the assumption of conditional independence for all pairs of random variables representing distinct relational ties (that is, $X_{ijm}$ and $X_{klh}$ are independent whenever $i \neq k$ and/or $j \neq l$) leads to the class of Bernoulli multigraphs (see Frank & Nowicki, 1993; Wasserman & Pattison, in press); the assumption of conditional dependence of $X_{ijm}$ and $X_{klh}$ if and only if $\{i, j\} = \{k, l\}$ leads to the class of multivariate dyad independence models (see Wasserman, 1987; Wasserman & Pattison, in press). The assumption of conditional independence of $X_{ijm}$ and $X_{klh}$ if and only if $\{i, j\} \cap \{k, l\} = \emptyset$ gives rise to the class of multivariate Markov random graphs. Of course, if the dependence graph is fully connected, then a general class of random graphs is obtained.
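The three dependence assumptions just described can be written as rules on pairs of nodes of the dependence graph. The sketch below builds the node set for a small example and counts the edges each rule generates; the function names are hypothetical, actors and relations are indexed from 0, and the example sizes (four actors, two relations) are chosen only for illustration.

```python
from itertools import combinations, permutations

def dependence_nodes(g, r):
    """Node set N_D: all triples (i, j, m) with i != j."""
    return [(i, j, m) for i, j in permutations(range(g), 2) for m in range(r)]

def bernoulli_multigraph(a, b):
    """Dependence only between ties on the same ordered pair (i, j)."""
    return (a[0], a[1]) == (b[0], b[1])

def dyad_independence(a, b):
    """Dependence if and only if {i, j} = {k, l}."""
    return {a[0], a[1]} == {b[0], b[1]}

def markov(a, b):
    """Dependence if and only if {i, j} and {k, l} share an actor."""
    return bool({a[0], a[1]} & {b[0], b[1]})

def dependence_edges(g, r, rule):
    return [(a, b) for a, b in combinations(dependence_nodes(g, r), 2) if rule(a, b)]

for rule in (bernoulli_multigraph, dyad_independence, markov):
    print(rule.__name__, len(dependence_edges(4, 2, rule)))
```

Each rule contains the previous one, so the edge sets are nested: the Markov dependence graph is by far the densest of the three.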

The Hammersley–Clifford theorem (Besag, 1974) establishes that a probability model for X depends only on the complete subgraphs, or cliques, of the dependence graph D. (A subset $A \subseteq N_D$ is complete if every pair of nodes in A is linked by an edge of D. A subset comprising a single node is also regarded as complete.) In particular, application of the Hammersley–Clifford theorem yields a characterization of P(X = x) in the form of an exponential family of distributions:

$$P(X = x) = \frac{1}{\kappa} \exp\Bigl\{ \sum_{A \subseteq N_D} \lambda_A \prod_{(i,j,m) \in A} x_{ijm} \Bigr\}, \qquad (1)$$

where $\kappa = \sum_x \exp\{\sum_{A \subseteq N_D} \lambda_A \prod_{(i,j,m) \in A} x_{ijm}\}$ is a normalizing quantity; $D$ is the dependence graph for $X$ (the summation is over all subsets $A$ of nodes of $D$); $\prod_{(i,j,m) \in A} x_{ijm}$ is the sufficient statistic corresponding to the parameter $\lambda_A$; and $\lambda_A = 0$ whenever the subgraph induced by the nodes in $A$ is not a clique of $D$.

The set of non-zero parameters in a model for $P(X = x)$ is thus determined by the collection of the maximal cliques of the dependence graph. A maximal clique is a complete subgraph that is not properly contained in any other complete subgraph. Note that any subgraph of a complete subgraph is also complete, so that if $A$ is a maximal clique of $D$, then there will be non-zero parameters for $A$ and all of its subgraphs.

3.1.2. Dependence structures for social networks

It is clear from model (1) (which we can refer to as the multivariate p* distribution) that, in order to construct a probability model for a multivariate random array, we need to specify an appropriate dependence structure. We therefore consider some likely forms of dependencies arising in multivariate arrays constructed from various types of social networks. The literature on structural models for social networks contains a number of theoretical claims about the structural properties of networks that can be used to construct candidate dependence structures.

Multiplexity: Interdependence of relations linking a pair of individuals. The large literature on role-sets (Merton, 1957; Winship & Mandel, 1983; see also Chapter 12 of Wasserman & Faust, 1994) attests to the widespread belief that there is a likely dependence between different ties linking any given pair of individuals. The essence of the claim is that the presence of one type of tie between individuals is likely to affect the presence of other types of tie, and that, over time, distinctive role-sets comprising the relations linking a pair of individuals characterize the relationship from individual $i$ to individual $j$. Such multiplex interdependencies lead to maximal cliques in the dependence graph of the form $\{(i, j, 1), (i, j, 2), \ldots, (i, j, r)\}$; if these are the only dependencies that are assumed, the general class of Bernoulli multigraphs is obtained. The parameters of the model have the form $\lambda_A$, where $A$ is any subset of the form $\{(i, j, m_1), (i, j, m_2), \ldots, (i, j, m_q)\}$; the corresponding sufficient statistic in model (1) is $\prod_{h=1}^{q} x_{ijm_h}$. Of course, a special case of the model is obtained if we assume complete independence of all observations. The maximal cliques of the dependence graph are then of the form $\{(i, j, m)\}$, with exactly one parameter corresponding to each random variable in the array. If homogeneity is assumed for all node pairs $(i, j)$, then the model has just one parameter for each relation $m$ (with sufficient statistic $\sum_{i,j} x_{ijm}$); if homogeneity is imposed across all random variables in the multivariate network, then the model possesses a single parameter (with sufficient statistic $\sum_{i,j,m} x_{ijm}$).

Exchange and reciprocity. It has also been widely argued that a tie of one type from an individual $i$ to another individual $j$ may be conditionally dependent on ties from $j$ to $i$ of other types (see, for example, Parsons, 1966, for a traditional appeal to mutually consistent expectations through role norms, and Leifer, 1988, for an alternative and interesting dynamic account). If such conditional dependencies alone are assumed, then maximal cliques of $D$ have the form $\{(i, j, m), (j, i, l)\}$. If these conditional dependencies are assumed as well as the multiplex conditional dependencies, maximal cliques have the form

$$\{(i, j, 1), (i, j, 2), \ldots, (i, j, r), (j, i, 1), (j, i, 2), \ldots, (j, i, r)\}.$$

In the latter case, model (1) describes the multivariate dyad independence model termed the multivariate $p_1$ model (Wasserman, 1987).

Role interlocking: path dependence. A third type of argument has pointed to the potential importance of role interlocking in social networks (for example, Boorman & White, 1976; Boyd, 1991; Lorrain & White, 1971; Pattison, 1993; White, 1977). It has been argued that the interrelationships among distinct types of ties can be represented by a partial ordering among labelled paths in a social network, where labelled paths systematically trace connections among sequences of individuals (see Pattison, 1993). More specifically, a path with the label $mh$ links individual $i$ to individual $j$ if there is a tie of type $m$ from $i$ to some intermediate individual $l$ and a tie of type $h$ from $l$ to $j$ (that is, if $X_{ilm} = 1$ and $X_{ljh} = 1$). Longer paths are defined recursively: a path of type $mhn$ links individual $i$ to individual $j$ if there is some individual $l$ such that $i$ is linked to $l$ by a path with the label $mh$ and $l$ is linked to $j$ by a tie with the label $n$. We refer to the path $mhn$ as the concatenation of paths $mh$ and $n$, and note that concatenation is associative; that is, paths constructed as the concatenation of $mh$ and $n$ link precisely the same pairs of individuals as paths constructed by the concatenation of $m$ and $hn$. Paths in networks have been claimed both to provide the essential framework for the flow of social processes, as, for example, in the research on the diffusion of innovations (see, for example, Coleman, Katz & Menzel, 1966; Michaelson, 1990), and to give rise to some powerful anticipatory effects (see Lee, 1969; Mayer, 1977). The most rudimentary form of dependence associated with social structures conceived in this form involves conditional dependence between the variables $X_{ilm}$ and $X_{ljh}$. The maximal cliques induced by such an assumption are cycles of length 2, $\{(i, j, m), (j, i, h)\}$, and cycles of length 3, $\{(i, j, m), (j, l, h), (l, i, n)\}$. The resulting random graph model is new, and is a special case of the multivariate Markov random graphs mentioned earlier. Here, we term the corresponding version of model (1) the path-dependent random multigraph model.

Actor effects. The fourth argument has arisen in the social cognition literature and posits actor attributes or biases associated with either the actor from whom the tie is directed (leading to a so-called row effect) or the actor to whom the tie is directed (a so-called column effect). Row effects are associated with conditional dependencies of the form $\{(i, j, m), (i, k, h)\}$ and give rise to maximal cliques in $D$ of the form

$$\{(i, 1, 1), (i, 1, 2), \ldots, (i, 1, r), (i, 2, 1), (i, 2, 2), \ldots, (i, 2, r), \ldots, (i, g, 1), (i, g, 2), \ldots, (i, g, r)\}.$$

Such dependencies are likely to be assumed when actor $i$ is the source of information about all relational ties emanating from actor $i$; they are also necessarily imposed if constraints are placed on the total number of ties directed from actor $i$ (see Holland & Leinhardt, 1973). Column effects are associated with conditional dependencies of the form $\{(i, j, m), (k, j, h)\}$, and so with maximal cliques

$$\{(1, j, 1), (1, j, 2), \ldots, (1, j, r), (2, j, 1), (2, j, 2), \ldots, (2, j, r), \ldots, (g, j, 1), (g, j, 2), \ldots, (g, j, r)\}.$$

Position effects and blockmodels. A fifth theme in the structural analysis of multiple networks is that distinctive patterns of inter-individual ties are associated with particular social positions. Thus, individuals occupying similar social positions may exhibit similar conditional dependencies among ties, whereas those occupying distinct positions may possess quite distinct inter-tie dependencies. Thus, knowledge of social position may be used as a basis for some hypothesized equations among the parameters referring to particular patterns of conditional dependencies (determined by cliques in the dependence graph); these issues are further discussed below.

Interdependence of interlocking roles. In addition, several of these arguments may be combined. For instance, if we assume conditional dependencies associated with arguments for multiplexity, reciprocity and exchange, as well as role-interlocking effects, then the class of Markov random multigraphs results; its maximal cliques have the form of either a multivariate triad,

$$\{(i, j, 1), \ldots, (i, j, r), (j, k, 1), \ldots, (j, k, r), (k, i, 1), \ldots, (k, i, r), (j, i, 1), \ldots, (j, i, r), (k, j, 1), \ldots, (k, j, r), (i, k, 1), \ldots, (i, k, r)\},$$

or a multivariate star,

$$\{(1, i, 1), \ldots, (1, i, r), (2, i, 1), \ldots, (2, i, r), \ldots, (g, i, 1), \ldots, (g, i, r), (i, 1, 1), \ldots, (i, 1, r), (i, 2, 1), \ldots, (i, 2, r), \ldots, (i, g, 1), \ldots, (i, g, r)\}.$$

Note that these three assumptions also entail actor effects; hence, we claim that the class of Markov random multigraphs is a quite plausible framework for the modelling of structure in multiple social networks.

3.1.3. Homogeneity constraints

For many of the specific dependence graphs that we have discussed, particularly for Markov random multigraphs, model (1) may require the estimation of a large number of parameters. It is often useful, therefore, to introduce certain equality constraints among the parameters, or to set certain parameters to zero. One can define a class of homogeneous models for multivariate networks in which networks that are isomorphic under relabellings of the nodes are equiprobable.

More generally, we introduce an assumption that parameters corresponding to certain isomorphic configurations of nodes are equal. We identify a random graph configuration with a subset $A$ of $N_D$, and we call configurations corresponding to $A$ and $B$ isomorphic if there is a one-to-one mapping $w$ on the nodes in $N$ such that $(i, j, m) \in A$ if and only if $(w(i), w(j), m) \in B$, for $i, j \in N$, $m \in R$. If two configurations $A$ and $B$ are isomorphic, we set $\lambda_A = \lambda_B$ and note that the sufficient statistic corresponding to $\lambda_A$ becomes

$$\sum_{B} \prod_{(i,j,m) \in B} x_{ijm},$$

where the summation is over all configurations $B$ isomorphic to $A$.

A more restricted form of parameter equating may also be useful when the nodes of the random graph are hypothesized to fall into distinct classes or positions (as in an a priori blockmodel; see, for example, White et al., 1976; Wasserman & Anderson, 1987; Wasserman & Faust, 1994, Chapter 10). In this case, the random graph nodes of the configuration identified with the subset $A$ may be regarded as coloured, and two configurations $A$ and $B$ are defined as isomorphic if there is a one-to-one mapping $w$ on the nodes of $N$ such that:

1. $(i, j, m) \in A$ if and only if $(w(i), w(j), m) \in B$;
2. $i$ and $w(i)$ have the same colour;
3. $j$ and $w(j)$ have the same colour.

We then set $\lambda_A = \lambda_B$ only if $A$ and $B$ are isomorphic (using this more restrictive definition).

3.2. The multivariate p* model

As mentioned, we refer to equation (1) as the multivariate p* model. The parameters of the model correspond to the cliques of the dependence graph $D$. The sufficient statistic corresponding to the parameter $\lambda_A$ for clique $A$ of $D$ has the form $\prod_{(i,j,m) \in A} x_{ijm}$; in the case where homogeneity effects have been imposed, the sufficient statistics are counts of such values over cliques whose parameters are set to be equal.

3.2.1. Introduction

The dependence structures for social networks described in the preceding section give rise to network statistics identified in Table 1. In order more easily to define the statistics, we introduce a counting function $f$ for an array $Z$ as the sum of entries in the array: $f_Z = \sum_{i,j} Z_{ij}$. The function $f$ is a count of the number of distinct ordered pairs of nodes $i$ and $j$ for which there is a relational tie of type $Z$. For convenience, we refer to the parameter corresponding to $f_Z$ as $\theta_Z$.

Table 1. Some statistics and parameters for univariate and multivariate relations

Effect                      Parameter        Graph statistic in 'count form'

Single dichotomous relations
  Choice                    $\theta$         $f_X$
  Mutuality                 $\rho$           $f_{X \cap X'}$
  Transitivity              $\tau$           $f_{XX \cap X}$
  Expansiveness             $\alpha_i$       $f_{X \cap R_i}$
  Attractiveness            $\beta_j$        $f_{X \cap C_j}$
  m-paths                   $\pi_m$          $f_{X^m}$
  Subgroup density          $\phi_{[st]}$    $f_{X \cap \delta_{st}}$
  Subgroup mutuality        $\rho_{[st]}$    $f_{X \cap X' \cap \delta_{st}}$
  Subgroup transitivity     $\tau_{[sut]}$   $f_{(X \cap \delta_{su})(X \cap \delta_{ut}) \cap (X \cap \delta_{st})}$

Multivariate relations
  Association               $C$              $f_{X \cap Y}$
  Multiplexity              $\theta_{kl}$    $f_{X_k \cap X_l}$
  Exchange                  $\rho_{kl}$      $f_{X_k \cap X_l'}$
  Generalized transitivity  $\tau_{klm}$     $f_{(X_k X_l) \cap X_m}$

When homogeneity constraints are imposed, we can represent the sufficient statistics in a compact form. For the assumption of multiplex conditional dependencies, any clique in $N_D$ has the form

$$A = \{(i, j, m_1), (i, j, m_2), \ldots, (i, j, m_q)\};$$

thus, in the homogeneous case, the sufficient statistic for the multiplex parameter associated with the clique $A$ is $f_Z$, where $Z = X_{m_1} \cap X_{m_2} \cap \cdots \cap X_{m_q}$. (Note that any non-empty subset of relations gives rise to a clique of this form, so that we also have statistics of the form $f_{X_m}$, $f_{X_k \cap X_l}$, and so forth.)

Reciprocity cliques of the form $\{(i, j, m), (j, i, l)\}$ give rise to the exchange statistics $f_{X_k \cap X_l'}$. Cliques in role-interlocking dependence structures lead to additional 2-path and 3-cycle statistics of the form $f_{X_m X_h}$ and $f_{(X_m X_h) \cap X_n'}$, respectively.

Some of the statistics for parameters reflecting row and column effects can be defined using the indicator matrices $R_i$ and $C_j$, whose elements are given by

$$(R_i)_{kl} = \begin{cases} 1 & \text{if } k = i, \\ 0 & \text{otherwise,} \end{cases} \qquad (C_j)_{kl} = \begin{cases} 1 & \text{if } l = j, \\ 0 & \text{otherwise.} \end{cases}$$

In order to define statistics for the Markov random multigraph model, let $R_k$ be any subset of relations and define $Y_k$ as the intersection of the relations in $R_k$. The triad statistic corresponding to a general multivariate triad has the general form $f_Z$ with $Z = (Y_1 \cap Y_4')(Y_2 \cap Y_5') \cap (Y_3 \cap Y_6')$ for some $Y_1, Y_2, \ldots, Y_6$.

When homogeneity is imposed only within $S$ possible blocks or positions, the network statistics that arise correspond to within-block sums and can be represented by using the matrix $\delta_{st}$ with entries

$$(\delta_{st})_{ij} = \begin{cases} 1 & \text{if } i \in \text{block } s \text{ and } j \in \text{block } t, \\ 0 & \text{otherwise.} \end{cases}$$

For example, in the case of any homogeneous statistic $f_Z$, the block-homogeneous set of statistics is $\{f_{Z \cap \delta_{st}} : s = 1, 2, \ldots, S;\ t = 1, 2, \ldots, S\}$.

Some other network statistics and associated parameters are also presented in Table 1. This table also identifies the parameter labels used in Wasserman & Pattison (1996) and their generalizations to multivariate networks.

Note that each of the statistics described above may be assumed to be homogeneous, or may be allowed to depend on some mutually exclusive and exhaustive partition of actors or pairs of actors. For example, generalized transitivity statistics may be calculated for every triple of subgroups arising from a partition (for example, $f_{(X_k \cap \delta_{su})(X_l \cap \delta_{ut}) \cap (X_m \cap \delta_{st})}$) and may be used to assess the homogeneity of generalized transitivity across subgroups.
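The count-form statistics can be assembled from four small operations on binary matrices. The sketch below is illustrative only: the helper names (`count`, `meet`, `converse`, `compose`, `block_indicator`) and the two 3-actor relations `X` and `Y` are assumptions made for the example, and composition is taken to be the arithmetic matrix product, which matches the sums over intermediate actors used for the change statistics in Section 3.4.

```python
def count(Z):
    """Counting function f_Z: the sum of the entries of the array Z."""
    return sum(sum(row) for row in Z)

def meet(A, B):
    """Intersection of two binary relations (elementwise product)."""
    return [[a * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def converse(A):
    """Converse relation A' (matrix transpose)."""
    return [list(col) for col in zip(*A)]

def compose(A, B):
    """Composition AB as an arithmetic matrix product (entries count 2-paths)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def block_indicator(blocks, s, t):
    """Matrix delta_st: 1 where the row actor is in block s and the column actor in block t."""
    return [[1 if bi == s and bj == t else 0 for bj in blocks] for bi in blocks]

# Hypothetical toy data: two relations on three actors, zero diagonals.
X = [[0, 1, 1], [1, 0, 0], [0, 1, 0]]
Y = [[0, 1, 0], [0, 0, 1], [0, 0, 0]]

choice = count(X)                              # f_X
mutuality = count(meet(X, converse(X)))        # f_{X ∩ X'}
transitivity = count(meet(compose(X, X), X))   # f_{XX ∩ X}
multiplexity = count(meet(X, Y))               # f_{X ∩ Y}
exchange = count(meet(X, converse(Y)))         # f_{X ∩ Y'}
between_blocks = count(meet(X, block_indicator([0, 0, 1], 0, 1)))  # f_{X ∩ delta_01}
```

Note that the mutuality count tallies ordered pairs, so each mutual dyad contributes 2.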


3.2.2. The model

In combination with various homogeneity constraints, model (1) can be written in the general form

$$P(X = x) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)}, \qquad (2)$$

where $\theta$ is a vector of model parameters and $z(x)$ is a vector of network statistics. As we have described, these vectors depend on the structure of the hypothesized dependence graph and on whether any homogeneity constraints have been proposed.

The model is of exponential family form; that is, the probability function depends on an exponential function of a linear combination of network statistics. In some cases, constraints on the elements of $\theta$ are required in order to ensure a set of uniquely determined parameters (as we illustrate later with our examples). Usually, the elements of $\theta$ are unknown and must be estimated.

The function $\kappa(\theta)$ in the denominator of model (2) is a normalizing quantity whose value guarantees that the probability distribution is indeed proper, summing to unity over the sample space of the random variable $X$ (the set of all possible multivariate networks with $r$ relations and $g$ actors).
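To make the size of this sum concrete, the normalizing quantity can be computed by brute force only for very small networks. The sketch below is an illustration under stated assumptions, not the authors' procedure: it uses a single relation on three actors with two statistics (number of ties and number of mutual dyads), and the names `stats`, `kappa` and `sample_space_size` are hypothetical.

```python
from itertools import permutations, product
from math import exp

def pairs(g):
    """All ordered pairs (i, j) with i != j."""
    return list(permutations(range(g), 2))

def stats(x, g):
    """z(x) = (number of ties, number of mutual dyads) for one relation."""
    ties = sum(x.values())
    mutual = sum(x[(i, j)] * x[(j, i)] for i in range(g) for j in range(i + 1, g))
    return (ties, mutual)

def kappa(theta, g):
    """Normalizing quantity: sum of exp(theta' z(x)) over every possible network x."""
    total = 0.0
    for bits in product((0, 1), repeat=len(pairs(g))):
        x = dict(zip(pairs(g), bits))
        total += exp(sum(t * s for t, s in zip(theta, stats(x, g))))
    return total

def sample_space_size(g, r):
    """Number of multivariate networks with r relations on g actors."""
    return 2 ** (r * g * (g - 1))
```

For three actors the sum has 64 terms; for a network with 29 actors and two relations, `sample_space_size(29, 2)` is 2 to the power 1624, which is why direct evaluation is not practical.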

Estimation of the parameters of models that assume only multiplexity and/or generalized reciprocity and exchange effects (as in the multivariate $p_1$ model) is not particularly difficult. In these cases, the likelihood function is simply the product of the probabilities for each multivariate tie or dyad (for example, see Wasserman, 1987). Estimation of parameters of the general multivariate p* model is not straightforward, however. The likelihood function for the parameters $\theta$ of p* depends on the complicated normalizing quantity $\kappa(\theta)$, which makes maximum likelihood estimation difficult, except in special circumstances (such as dyadic independence) and when the multigraphs are quite small (Walker, 1995). In order for probabilities to be computed, one must be able to calculate $\kappa$, which is just too difficult for most networks. Hence, alternative model formulations and approximate estimation techniques are important. One such alternative, which we now describe, utilizes log-odds ratios of the conditional probabilities of each element of $X$.

3.2.3. The logit model

We can turn model (2) into a generalized autologistic model for conditional probabilities, giving us an equivalence between model (2) and spatial models (Besag, 1972, 1974; Strauss, 1992). The step utilizes the dichotomous nature of the random variable $X_{ijm}$ and produces an approximate likelihood function that is much easier to deal with.

We first condition on the complement of $X_{ijm}$ and consider just the probability that the dichotomous random variable $X_{ijm}$ is unity. Recall that this variable records whether the tie from $i$ to $j$ of type $m$ is present. Specifically, consider

$$P(X_{ijm} = 1 \mid X^c_{ijm}) = \frac{P(X = x^+_{ijm})}{P(X = x^+_{ijm}) + P(X = x^-_{ijm})} = \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^+_{ijm})\} + \exp\{\theta' z(x^-_{ijm})\}}, \qquad (3)$$


which has the advantage of not depending on the normalizing quantity. We next consider the odds ratio, which simplifies model (3):

$$\frac{P(X_{ijm} = 1 \mid X^c_{ijm})}{P(X_{ijm} = 0 \mid X^c_{ijm})} = \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^-_{ijm})\}} = \exp\{\theta' [z(x^+_{ijm}) - z(x^-_{ijm})]\}. \qquad (4)$$

From this, the log-odds ratio, or logit model, has the rather simple expression

$$\omega_{ijm} = \log\left\{\frac{P(X_{ijm} = 1 \mid X^c_{ijm})}{P(X_{ijm} = 0 \mid X^c_{ijm})}\right\} = \theta' [z(x^+_{ijm}) - z(x^-_{ijm})]. \qquad (5)$$

If we define $\delta(x_{ijm}) = [z(x^+_{ijm}) - z(x^-_{ijm})]$, then the logit model (5) simplifies succinctly to $\omega_{ijm} = \theta' \delta(x_{ijm})$. The expression $\delta(x_{ijm})$ is the vector of changes in the network statistics that arises when the quantity $x_{ijm}$ changes from 1 to 0 (that is, the statistics computed with the tie present minus those computed with the tie absent). This version of the model is a logit p* model for a multivariate network, and is a generalized autologistic model (see Strauss, 1992) applied to social network data.

3.3. Estimation

The likelihood function for the general form of multivariate p*, model (2), is

$$L(\theta) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)},$$

where the dependence on the normalizing quantity can easily be seen. As mentioned, maximum likelihood estimation of $\theta$ is difficult due to the size of the sample space.

An approximate estimation approach, proposed by Besag (1975, 1977b) and adopted by Strauss (1986), Strauss & Ikeda (1990) and Wasserman & Pattison (1996), utilizes tools made popular in models for rectangular lattices and spatial data; specifically, we use the logit formulation and define the pseudo-likelihood function as

$$PL(\theta) = \prod_{i \neq j} \prod_{m=1}^{r} P(X_{ijm} = 1 \mid X^c_{ijm})^{x_{ijm}} \, P(X_{ijm} = 0 \mid X^c_{ijm})^{1 - x_{ijm}}, \qquad (6)$$

and a maximum pseudo-likelihood estimator (MPLE) to be the value of $\theta$ that maximizes (6). MPLEs are much easier to calculate than maximum likelihood estimators (MLEs). MPLEs differ from MLEs for all but the simplest models (those for which the conditional probabilities are indeed independent of the complement relation). Basically, the approach assumes conditional independence of the random variables representing the multivariate relational ties (for discussion of the issues in using maximum pseudo-likelihood rather than maximum likelihood estimation, see Wasserman & Pattison, 1996, and Preisler, 1993).

There is a large literature on the use of approximate likelihoods in spatial modelling. Diggle (1996) reviews models for discrete spatial variation and notes that there are several possible estimation techniques. He notes in his detailed discussion that MPLEs are more efficient than other possibilities (which include the coding method of Besag, 1974). Further, for moderately large samples, the differences between MPLEs and MLEs are often negligible. Small sample sizes, and hence small networks ($g < 10$), unfortunately, are particularly problematic.


In social network modelling, Strauss & Ikeda (1990) established that estimation of $\theta$ for single dichotomous relations can be accomplished via logistic regression, using any standard logistic regression model-fitting routine. In particular, they showed that maximizing the pseudo-likelihood given in equation (6) is equivalent to maximizing the likelihood function for the fit of logistic regression to model (5) (for independent observations $x_{ijm}$). Further, they observed that such logistic regressions can be fitted using iteratively reweighted Gauss–Newton computational techniques, as implemented by any logistic regression model package.

The proof of this result uses the fact that the derivatives of the pseudo-likelihood, set equal to zero, are identical to those obtained from a logistic regression with the relational variables as data values. Thus, fitting p* can be done by using the logit p* form and assuming that the relational variables are actually statistically independent. The idea for this theorem was first suggested by Frank & Strauss (1986) for estimation of the parameters in their triad model. The generalization of this result to the three-way binary array $X$ is straightforward.
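As an illustration of the equivalence with logistic regression, the sketch below maximizes the pseudo-likelihood by Newton–Raphson iteration on a logistic regression with no separate intercept, where each row of `deltas` is the difference vector for one tie variable and `ys` holds the observed tie values. This is a stand-in written for the example, not the routine used by the authors; the function names `solve` and `mple` and the fixed iteration count are assumptions, and in practice any standard logistic regression package could be substituted.

```python
from math import exp, log

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                factor = M[r][c] / M[c][c]
                M[r] = [a - factor * v for a, v in zip(M[r], M[c])]
    return [M[r][n] / M[r][r] for r in range(n)]

def mple(deltas, ys, iters=25):
    """Maximum pseudo-likelihood estimate of theta via Newton-Raphson."""
    p = len(deltas[0])
    theta = [0.0] * p
    for _ in range(iters):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for d, y in zip(deltas, ys):
            mu = 1.0 / (1.0 + exp(-sum(t * v for t, v in zip(theta, d))))
            for a in range(p):
                grad[a] += (y - mu) * d[a]          # score contribution
                for b in range(p):
                    hess[a][b] += mu * (1.0 - mu) * d[a] * d[b]  # information
        step = solve(hess, grad)
        theta = [t + s for t, s in zip(theta, step)]
    return theta
```

In the simplest case of a single choice parameter, every difference vector equals 1 and the estimate reduces to the log-odds of the observed density, which gives a quick sanity check on any implementation.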

The evaluation of the fit of multivariate p* is not straightforward, but it is helpful to compare the observed values $x_{ijm}$ with the fitted values $\hat{x}_{ijm}$. The fitted values, as is common with dichotomous variables, are defined as $\hat{x}_{ijm} = \hat{P}(X_{ijm} = 1 \mid X^c_{ijm})$. The estimated conditional probabilities are computed from

$$\operatorname{logit} \hat{P}(X_{ijm} = 1 \mid X^c_{ijm}) = \hat{\theta}' \delta(x_{ijm}).$$

Two useful indices of fit are the pseudo-likelihood ratio statistic

$$G^2_{PL} = 2 \sum x_{ijm} \log(x_{ijm} / \hat{x}_{ijm})$$

for a model, and the mean of the absolute value of the residuals $(x_{ijm} - \hat{x}_{ijm})$. In the examples below, we report both $G^2_{PL}$ and the mean absolute residual. Unfortunately, as with all other uses of this MPLE approach, the distribution of $G^2_{PL}$ is unknown, even asymptotically, and there is no straightforward way of estimating the standard errors of parameter estimates (although asymptotic standard errors calculated from logistic regression models can give approximate guidance to the modeller). Crouch & Wasserman (1998) give some preliminary results comparing MPLEs to MLEs, and report the optimistic finding that, for moderately large networks ($g > 10$), both standard errors and test statistics based on the pseudo-likelihood approach are quite close to those based on the exact likelihood.

3.4. Computational details

Maximum pseudo-likelihood estimates of the parameters of model (1) are obtained by fitting the logistic regression model (5). In order to fit model (5), we compute, for each relational tie, the values of the 'explanatory variables' $z(x^+_{ijm}) - z(x^-_{ijm})$ corresponding to each statistic $z(x)$; we then use these as the observed explanatory variables for the realization of $X_{ijm}$ (the 'response variable') in the logistic regression corresponding to model (5).

The computation of the values $z(x^+_{ijm}) - z(x^-_{ijm})$ is simple, but it is useful to note that the values may take a different form for the various types of relational ties (corresponding to the subscript $m$ of $X_{ijm}$). For example, suppose that there are two relations, $X_l$ and $X_h$, respectively, and consider the parameter corresponding to the triadic effect $Z = (X_l X_h) \cap X_h$. If we assume homogeneity, then the sufficient statistic for this parameter is $f_Z$. For the two relations, the computed values of the explanatory variable for this triadic effect are equal to the changes in the statistic $f_Z$ when $x_{ijm}$ changes from 1 to 0, for $m = l$ or $h$. Thus, when $m = l$ (corresponding to the values for the first relation, $X_l$), we compute $\sum_k x_{ikh} x_{jkh}$ as the value of the explanatory variable corresponding to this parameter, and when $m = h$ (corresponding to an $X_h$ tie), we compute $\sum_k x_{ikl} x_{kjh} + \sum_k x_{kil} x_{kjh}$.
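The two closed-form expressions for the triadic effect can be checked numerically against a brute-force difference of the statistic. The sketch below is illustrative: the function names and the randomly generated 5-actor relations are assumptions, the statistic is taken to be the sum over $i$, $j$, $k$ of $x_{ikl} x_{kjh} x_{ijh}$, and the diagonals are fixed at zero.

```python
import random

def f_triadic(xl, xh):
    """f_Z for Z = (X_l X_h) ∩ X_h: sum over i, j, k of x_ikl * x_kjh * x_ijh."""
    g = len(xl)
    return sum(xl[i][k] * xh[k][j] * xh[i][j]
               for i in range(g) for j in range(g) for k in range(g))

def toggle_difference(xl, xh, i, j, which):
    """Brute force: statistic with the (i, j) tie set to 1 minus the value with it set to 0."""
    values = []
    for v in (1, 0):
        a = [row[:] for row in xl]
        b = [row[:] for row in xh]
        (a if which == "l" else b)[i][j] = v
        values.append(f_triadic(a, b))
    return values[0] - values[1]

def closed_form(xl, xh, i, j, which):
    """Closed-form change statistics quoted in the text for m = l and m = h."""
    g = len(xl)
    if which == "l":
        return sum(xh[i][k] * xh[j][k] for k in range(g))
    return (sum(xl[i][k] * xh[k][j] for k in range(g))
            + sum(xl[k][i] * xh[k][j] for k in range(g)))

def random_relation(g, rng):
    """Random binary relation on g actors with a zero diagonal."""
    return [[0 if i == j else rng.randint(0, 1) for j in range(g)] for i in range(g)]

rng = random.Random(0)
xl, xh = random_relation(5, rng), random_relation(5, rng)
agree = all(toggle_difference(xl, xh, i, j, w) == closed_form(xl, xh, i, j, w)
            for i in range(5) for j in range(5) if i != j for w in ("l", "h"))
```

The agreement relies on the zero diagonal: the only terms in which the toggled tie would appear twice carry a factor $x_{iil}$, which vanishes.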

4. Examples

We illustrate the construction and fitting of multivariate p* models using two examples.

4.1. The Grade 7 peer network

The first example is an extension of the data analysed by Wasserman & Pattison (1996). Vickers (1981) and Vickers & Chan (1981) obtained network data from 29 students in grade 7 in a school in Victoria, Australia. They asked students to nominate their classmates on a number of relations, including the following:

1) Who are your best friends in the class?
2) Who would you rather not have as a friend?

We label the relations defined by these two questions as $X_B$ (relation 1) and $X_N$ (relation 2), and their associated matrices as $B$ and $N$, respectively. The matrix for the 'best friends' relation is given here as our Table 2, and the matrix for the 'not friends' relation as our Table 3. As noted by Wasserman & Pattison (1996), actors 1–12 are boys, while actors 13–29 are girls.

In Wasserman & Pattison (1996), we analysed the relation $X_B$ and established that it possessed strong reciprocity and transitivity effects. Here we fit models simultaneously to the relations $X_B$ and $X_N$ in an attempt to model their mutual interdependence. Our models use the methodology described earlier and are guided by the literature that has speculated on the structure of positive and negative affect ties (see the discussion in Wasserman & Faust, 1994, Chapter 6, on signed graphs); we also compare our models to previous descriptive analyses of similar types of ties. We report the fit of a number of homogeneous models.

Models 1a and 1b – independence. We first fit two versions of a complete independence model, in which we make the (implausible) assumption that all observed ties are independent. In the first version of the model, we allow a single, separate 'choice' parameter $\theta_Z$ (where $Z$ may be either $B$ or $N$) for each type of relation; in the second, more restricted version, we assume a single common choice parameter. In both versions of the model, the maximal cliques of the dependence graph have the form $\{(i, j, m)\}$; in model 1a, the parameters corresponding to this clique are assumed to depend on relation $m$ (but not on actor $i$ or $j$), whereas in model 1b, the parameter is assumed constant for all $i$, $j$ and $m$. The sufficient statistics for model 1a are $f_B$ and $f_N$; model 1b has sufficient statistic $f_{B+N}$. The fit of models 1a and 1b is summarized in Table 4. Neither model provides a good fit, with the mean of the absolute residuals equal to approximately 0.37. Since model 1b is nested in model 1a, the difference between the pseudo-likelihood ratio statistics is of interest, and we note that model 1b appears to be no worse a fit than model 1a ($\Delta G^2_{PL} = 3.2$, and the models differ by one parameter).


Model 2 – multiplexity. Model 2 is a multiplexity model with maximal cliques $\{(i, j, 1), (i, j, 2)\}$. The model allows for the possibility that an $X_B$ tie from $i$ to $j$ is conditionally dependent on an $X_N$ tie from $i$ to $j$. The parameters of the model have the form $\theta_Z$, where $Z$ may be $B$, $N$ or $B \cap N$; the corresponding sufficient statistics are $f_B$, $f_N$ and $f_{B \cap N}$, respectively. Thus, this model adds a single multiplex parameter, $\theta_{B \cap N}$, to the two choice parameters in model 1a. Model 2 appears to be a substantial improvement over model 1a ($\Delta G^2_{PL} = 253.7$ with one additional parameter), but the small frequency of $B \cap N$ ties implies that the MPLE of its corresponding parameter is likely to have a large standard error.

Models 3a and 3b ndash reciprocity and exchange Model 3 assumes bivariate dyad indepen-dence (as described by Wasserman 1987) and has maximal cliques(i j 1) (i j 2) (j i 1) (j i 2) We t two restricted versions of the model rst model3a in which only choice and reciprocity effects are assumed (with parameters hz forZ = B N B Ccedil B 9 and N Ccedil N9 ) and second model 3b with an additional exchange para-meter hz for the relation Z = B Ccedil N 9 In model 3a the presence of an XB tie from i to j isassumed to be conditionally dependent on the presence of an XB tie from j to i (that is on thepresence of reciprocity) similarly for XN ties Model 3b allows in addition the presence ofan XB tie from i to j to be conditionally dependent on the presence of an XN tie from j to i (that

Philippa Pattison and Stanley Wasserman182

Table 2 Vickers amp Chanrsquos (1981) network data lsquobest friendsrsquo relation

0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 1 1 0 1 0 1 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 00 0 0 1 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 1 0 0 0 1 0 1 0 1 1 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 0 1 01 1 1 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 01 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 0 0 1 1 11 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 00 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 01 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 0 0 1 1 1 1 0 0 0 0 0 0 01 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 1 1 1 1 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 1 0 0 0 1 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 1 1 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 1 0 1 0 0 1 1 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 1 0 0 1 1 1 10 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 00 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0

is on the exchange of an $X_N$ tie for an $X_B$ one). We have not fitted the most general homogeneous dyad-independence model, which includes multiplexity parameters, since B and N co-occur only rarely (and as a result it is difficult to fit parameters corresponding to relations such as $B \cap N$, $B \cap N \cap B'$, and so forth). The fit statistics in Table 4 indicate not only that model 3a is a substantial improvement over model 1a ($\Delta G^2_{PL} = 208.6$ with just two additional parameters), but also that model 3b provides a marginally better fit than model 3a ($\Delta G^2_{PL} = 27.1$ with one additional parameter). These figures suggest the presence of both reciprocity and exchange effects. Note, though, that the fit of model 3b is still not particularly good, with the mean of the absolute residuals equal to 0.311.

Table 3. Vickers & Chan's (1981) network data: 'not friends' relation

0 0 0 0 0 0 1 0 1 1 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 0 0 1 1
0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 1 1 0 0 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0
1 0 1 1 0 0 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 1
0 0 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
0 0 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0
1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 1
1 0 0 1 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0

Table 4. Summary of fit of models 1a–5b to the grade 7 peer network

Model   No. of parameters   $G^2_{PL}$   Mean absolute residual
1a       2                  1794.1       0.366
1b       1                  1797.3       0.367
2        3                  1540.4       0.329
3a       4                  1585.5       0.315
3b       5                  1558.4       0.311
4       13                  1511.0       0.300
5a      19                  1220.6       0.241
5b      23                  1032.3       0.196
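The model comparisons quoted from Table 4 are plain differences in pseudo-likelihood deviance between nested models. The following minimal sketch recomputes them; the dictionary `table4` and the helper `compare` are illustrative names rather than anything from the original analysis, and the values assume one decimal place in the printed $G^2_{PL}$ column.

```python
# G2_PL and parameter counts as read from Table 4 (one decimal place assumed).
table4 = {
    "1a": (2, 1794.1), "1b": (1, 1797.3), "2": (3, 1540.4),
    "3a": (4, 1585.5), "3b": (5, 1558.4), "4": (13, 1511.0),
    "5a": (19, 1220.6), "5b": (23, 1032.3),
}


def compare(smaller, larger, table=table4):
    """Return (drop in G2_PL, extra parameters) when moving from the
    smaller model to the larger one."""
    p_small, g_small = table[smaller]
    p_large, g_large = table[larger]
    return round(g_small - g_large, 1), p_large - p_small


for pair in [("1a", "3a"), ("3a", "3b"), ("3b", "4"), ("4", "5a"), ("5a", "5b")]:
    drop, extra = compare(*pair)
    print(f"{pair[0]} -> {pair[1]}: delta G2_PL = {drop}, extra parameters = {extra}")
```

Because these are pseudo-likelihood deviances, the differences are best read as approximate guides to relative fit rather than as exact likelihood-ratio statistics.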

Model 4 – path dependence. Model 4 is a path-dependent model and assumes that a tie of any type from i to j may be conditionally dependent on ties of any type from j to some third individual k. Maximal cliques therefore have the form $\{(i, j, m), (j, i, h)\}$ or $\{(i, j, m), (j, k, h), (k, i, p)\}$; parameters and sufficient statistics are given by $\eta_Z$ and $\phi_Z$, respectively, where Z may be any of the relations $B$, $N$, $B \cap B'$, $N \cap N'$, $B \cap N'$, $BB$, $BN$, $NB$, $NN$, $BB \cap B'$, $BN \cap B'$, $BN \cap N'$ and $NN \cap N'$. Compared to model 3b, model 4 adds only marginally to the fit ($\Delta G^2_{PL} = 47.4$ with eight additional parameters).
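To make the compound relations concrete, the sketch below counts the configurations behind three kinds of model 4 statistic on a small hypothetical network: dyadic relations such as $B \cap N'$, 2-paths such as $BN$, and 3-cycles such as $BB \cap B'$. The function names and the toy matrices are assumptions made for illustration. The sketch counts one configuration per ordered labelling of distinct nodes; whether the paper's statistics use that convention or count each configuration once is not recoverable from this excerpt.

```python
from itertools import permutations


def count_dyadic(first, second):
    """Ordered pairs (i, j), i != j, with a `first` tie i->j and a
    `second` tie j->i (e.g. B and N give the relation B intersect N')."""
    n = len(first)
    return sum(first[i][j] * second[j][i]
               for i, j in permutations(range(n), 2))


def count_2paths(first, second):
    """Ordered triples of distinct nodes (i, k, j) with a `first` tie i->k
    and a `second` tie k->j (e.g. B and N give the relation BN)."""
    n = len(first)
    return sum(first[i][k] * second[k][j]
               for i, k, j in permutations(range(n), 3))


def count_3cycles(first, second, third):
    """Ordered triples (i, k, j) with i->k in `first`, k->j in `second`
    and j->i in `third` (e.g. B, B, B gives BB intersect B')."""
    n = len(first)
    return sum(first[i][k] * second[k][j] * third[j][i]
               for i, k, j in permutations(range(n), 3))


# Hypothetical 3-node example: B is the cycle 0->1->2->0, N has the tie 0->2.
B = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
N = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]

print(count_dyadic(B, N))      # exchange: B tie 2->0 answered by N tie 0->2
print(count_2paths(B, B))      # 2-paths made of two B ties
print(count_3cycles(B, B, B))  # one B cycle, counted once per starting node
```

With a single relation in all three positions, each 3-cycle is counted three times (once per rotation), which is one reason the counting convention matters when interpreting the size of an estimate.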

Models 5a and 5b – restricted Markov random graph models. The final set of models are path-dependent models with additional dependencies assumed on substantive grounds. All models have the model 4 parameters; in addition, model 5a possesses dependencies consistent with the transitivity-like hypothesis that friends are likely to agree on their relations with third parties (hence, likely pairwise conditional dependencies between relational ties of type $X_B$ from i to j, i to k and j to k, and also between relational ties of type $X_B$ from i to j, of type $X_N$ from i to k, and of type $X_N$ from j to k). Model 5b possesses additional dependencies consistent with the claim that non-friends are likely to disagree on their relations with third parties (hence, likely pairwise conditional dependencies between relational ties of type $X_N$ from i to j, of type $X_N$ from i to k, and of type $X_B$ from j to k, and also between relational ties of type $X_N$ from i to j, of type $X_B$ from i to k, and of type $X_N$ from j to k). (See Johnsen (1986) for a review and analysis of the literature on the structure of affective ties, and Pattison (1993) for an algebraic translation of these structural claims.) Model 5a adds $\{(i, j, 1), (j, k, 1), (i, k, 1)\}$ and $\{(i, j, 1), (j, k, 2), (i, k, 2)\}$ to the set of maximal cliques for model 4; model 5b also adds $\{(i, j, 2), (j, k, 1), (i, k, 2)\}$ and $\{(i, j, 2), (j, k, 2), (i, k, 1)\}$. We note that all of the subcliques of these additional maximal cliques have corresponding parameters in models 5a and 5b; these additional subcliques correspond to various forms of stars, $\{(i, j, m), (i, k, h)\}$, $\{(i, j, m), (k, j, h)\}$ and $\{(i, j, m), (j, k, h)\}$. As indicated in Table 4, the additional dependencies assumed by model 5a lead to a substantial improvement over the simple path-dependent model 4 ($\Delta G^2_{PL} = 290.4$ with six additional parameters), and those associated with model 5b lead to a modest further improvement in fit ($\Delta G^2_{PL} = 188.3$ with four additional parameters). The mean of the absolute residuals for model 5b is 0.196, suggesting a more reasonable fit to the data (but one that could lend itself to further possible improvement).
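The four triadic cliques added in models 5a and 5b share one shape: a tie from i to j, a tie from j to k, and a closing tie from i to k, with the relation types varying. A minimal sketch of the corresponding counts follows; `count_closed_2paths` and the toy matrices are hypothetical, and the statistic is assumed here to count ordered triples of distinct nodes.

```python
from itertools import permutations


def count_closed_2paths(first, second, closing):
    """Ordered triples of distinct nodes (i, j, k) with a `first` tie i->j,
    a `second` tie j->k and a `closing` tie i->k."""
    n = len(first)
    return sum(first[i][j] * second[j][k] * closing[i][k]
               for i, j, k in permutations(range(n), 3))


# Hypothetical 4-node example.
# B (friends): 0->1, 1->2, 0->2.  N (not friends): 0->3, 1->3.
B = [[0, 1, 1, 0], [0, 0, 1, 0], [0, 0, 0, 0], [0, 0, 0, 0]]
N = [[0, 0, 0, 1], [0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 0]]

print(count_closed_2paths(B, B, B))  # clique {(i,j,1),(j,k,1),(i,k,1)}
print(count_closed_2paths(B, N, N))  # clique {(i,j,1),(j,k,2),(i,k,2)}
print(count_closed_2paths(N, B, N))  # clique {(i,j,2),(j,k,1),(i,k,2)}
print(count_closed_2paths(N, N, B))  # clique {(i,j,2),(j,k,2),(i,k,1)}
```

In the toy data, nodes 0 and 1 are friends who agree both about a friend (node 2) and about a non-friend (node 3), so only the two model 5a configurations occur.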

The MPLEs for the parameters of model 5b are displayed in Table 5. Positive estimates were observed for both reciprocity parameters and for the parameters associated with three of the four additional hypothesized dependencies. Thus, the conditional odds of a tie of any type appear to be enhanced if a reciprocal tie of the same type is present, if the tie completes one of the expected triadic structures for agreement between friends, or if the tie completes a triad in which an individual would rather not have as a friend any friend of someone who has been indicated as a non-friend. Negative estimates were obtained for the exchange parameter, for 2-stars comprising two incoming $X_B$ ties, and for 3-cycles comprising $X_B$ ties. Thus, the conditional odds of a tie of any type appear to be reduced by the presence of a reciprocated tie of the other type; in addition, the odds of an $X_B$ tie being directed to a particular individual are reduced if other $X_B$ ties are also directed to the same individual, or if the tie completes a 3-cycle of $X_B$ ties.
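Because the model is a logit model, each estimate can be read as a change in conditional log-odds, and exponentiating it gives a multiplier on the conditional odds. The sketch below illustrates this for three Table 5 estimates, read with two decimal places. It assumes that adding the tie raises the relevant statistic by exactly one unit and leaves every other statistic unchanged, which is a simplification; `odds_multiplier` is a name introduced here.

```python
import math

# Estimates as read from Table 5 (two decimal places assumed).
estimates = {
    "B reciprocity (B with B')": 2.53,
    "N reciprocity (N with N')": 0.61,
    "exchange (B with N')": -0.67,
}


def odds_multiplier(eta, change=1):
    """Multiplier on the conditional odds of a tie when the associated
    statistic changes by `change` units and nothing else changes."""
    return math.exp(eta * change)


for label, eta in estimates.items():
    print(f"{label}: odds multiplied by {odds_multiplier(eta):.2f}")
```

Under these assumptions a reciprocating friendship tie multiplies the conditional odds of a friendship tie by a factor of roughly twelve, whereas a reciprocating tie of the other type roughly halves them.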

4.2. Padgett & Ansell's Florentine network

Our second example is an analysis of marriage and business ties among groups of Florentine families (Padgett & Ansell, 1993). In an analysis of the rise to power of the Medici family in Florence in the early fifteenth century, Padgett & Ansell constructed a number of network relations among 33 groups of elite families, including marriage and business or economic ties. The construction was based on a coding of various types of network relations among a 92-family ruling elite from Kent's (1978) description of the network foundations of the Medici party and their opponents. Padgett & Ansell used marriage and economic networks to derive a clustering of the 92 families into 33 family groups (using the CONCOR algorithm; see Breiger, Boorman & Arabie, 1975); they then coded a relation of a particular type between two family groups if there were at least two pairs of families, with one family from each group, linked by a relation of that type. The analysis presented below is for marriage and economic relations among these 33 family groups, shown in Figure 2a of Padgett & Ansell (1993); for the purpose of the analysis reported below, within-group relationships are ignored, and the various types of economic ties are aggregated into a single business/economic relation. Thus, a marriage tie is coded from one group to another if a woman of the first group is married to a man in the second; a business/economic tie signifies the presence of trading or partnership relationships, the sharing or renting of real estate, or a bank employment relation (see Padgett & Ansell, 1993, pp. 1265–1266).

Table 5. Parameter estimates for model 5b fitted to the grade 7 peer network

Model parameter                       Z                $\eta_Z$   Approximate standard error
1-paths (choice)                      $B$              -1.81      0.76
                                      $N$              -2.39      0.65
2-cycles (reciprocity & exchange)     $B \cap B'$       2.53      0.37
                                      $N \cap N'$       0.61      0.26
                                      $B \cap N'$      -0.67      0.28
2-paths                               $BB$              0.01      0.05
                                      $BN$             -0.03      0.04
                                      $NB$             -0.11      0.04
                                      $NN$              0.02      0.04
3-cycles                              $BB \cap B'$     -0.72      0.14
                                      $BN \cap B'$      0.05      0.08
                                      $BN \cap N'$      0.03      0.07
                                      $NN \cap N'$     -0.05      0.09
2-stars                               $BB'$            -0.36      0.08
                                      $BN'$            -0.08      0.04
                                      $NN'$             0.06      0.04
                                      $B'B$            -0.01      0.04
                                      $B'N$            -0.04      0.03
                                      $N'N$             0.07      0.02
Additional hypothesized constraints   $BB \cap B$       0.57      0.06
                                      $BN \cap N$       0.17      0.05
                                      $NB \cap N$       0.33      0.05
                                      $NN \cap B$      -0.09      0.06
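The rule for coding group-level relations from family-level ties can be sketched as follows. This is an illustration of the "at least two pairs of families" rule as described above, not Padgett & Ansell's own procedure; the function `group_ties`, the family labels and the group labels are hypothetical, and the sketch assumes that direction is respected when pairs are counted.

```python
def group_ties(family_ties, group_of, threshold=2):
    """Aggregate directed family-level ties to directed group-level ties.

    family_ties: iterable of (f, g) pairs, meaning a tie from family f to g.
    group_of:    dict mapping each family to its group label.
    Returns the set of group pairs (a, b), a != b, linked by at least
    `threshold` distinct family pairs.
    """
    counts = {}
    for f, g in set(family_ties):          # count each family pair once
        a, b = group_of[f], group_of[g]
        if a != b:                         # within-group ties are ignored
            counts[(a, b)] = counts.get((a, b), 0) + 1
    return {pair for pair, c in counts.items() if c >= threshold}


# Hypothetical data: five families in three groups.
group_of = {"f1": 1, "f2": 1, "f3": 2, "f4": 2, "f5": 3}
marriages = [("f1", "f3"), ("f2", "f4"), ("f1", "f5"), ("f1", "f2")]

print(group_ties(marriages, group_of))
```

In the toy data only groups 1 and 2 are joined by two family pairs, so a single group-level tie results; the lone tie to group 3 and the within-group tie are both dropped.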

Padgett & Ansell used the interconnections among social and demographic factors, these relational ties, and actions on the part of Cosimo de' Medici to explain the source of the latter's extraordinary power; here, we examine the joint network structure of the marriage and business/economic ties.

We label the relations studied by Padgett & Ansell as $X_B$ (business ties) and $X_M$ (marriage ties). Their associated matrices are B and M, respectively.

In Table 6, we report the fit of six classes of models, similar in construction to those reported for the grade 7 peer network. As for the grade 7 peer network, models 1a and 1b are two- and one-parameter complete independence models, respectively, and model 2 is a multiplexity model. It is clear from Table 6 that there is little improvement in fit of the two-parameter choice complete independence model (model 1a) over the one-parameter choice model (model 1b) ($\Delta G^2_{PL} = 0.7$ with one extra parameter); in addition, permitting dependencies among marriage and business ties for the same individuals does little to improve model fit ($\Delta G^2_{PL} = 0.4$ for model 2 compared to model 1a). Models 3a and 3b are reciprocity and exchange models. Model 3a adds to model 1a the reciprocity effects for $X_B$ and $X_M$ ties; model 3b further adds the exchange effect that allows conditional dependence of a marriage tie from i to j and a business tie from j to i. The reciprocity effects in model 3a lead to a substantial improvement in fit over model 1a ($\Delta G^2_{PL} = 164.0$ with two additional parameters), but no further improvement is achieved by permitting the dyadic exchange of marriage and business ties ($\Delta G^2_{PL} = 0.2$). Model 4 is a path-dependent model and is a marginal improvement in fit over model 3b ($\Delta G^2_{PL} = 45.1$ with six additional parameters). Parameters corresponding to cycles with two or more business ties were excluded from the model because of the infrequency of occurrence of such structures.
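The dyadic models above can be written in logit form: the conditional log-odds of each tie is a linear function of its change statistics, and the pseudo-likelihood deviance sums the resulting Bernoulli log-probabilities over all possible ties. The sketch below shows this for a model with two choice parameters, two reciprocity parameters and one exchange parameter, in the spirit of model 3b. All names and the 3-node data are hypothetical, and the sketch assumes that $G^2_{PL}$ is minus twice the log pseudo-likelihood.

```python
import math


def change_stats(nets, i, j, m):
    """Change statistics for tie (i, j, m): each statistic with the tie
    present minus the same statistic with the tie absent.
    Order: [choice_0, choice_1, reciprocity_0, reciprocity_1, exchange]."""
    d = [0, 0, 0, 0, 0]
    d[m] = 1                    # one more tie of type m
    d[2 + m] = nets[m][j][i]    # completes a mutual dyad within relation m
    d[4] = nets[1 - m][j][i]    # tie answered by the other relation
    return d


def pseudo_deviance(nets, theta):
    """Minus twice the log pseudo-likelihood of the observed ties."""
    n = len(nets[0])
    log_pl = 0.0
    for m in (0, 1):
        for i in range(n):
            for j in range(n):
                if i == j:
                    continue
                stats = change_stats(nets, i, j, m)
                logit = sum(t * s for t, s in zip(theta, stats))
                p = 1.0 / (1.0 + math.exp(-logit))
                log_pl += math.log(p) if nets[m][i][j] else math.log(1.0 - p)
    return -2.0 * log_pl


# Hypothetical data: relation 0 has a mutual dyad, relation 1 a single tie.
M = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
B = [[0, 0, 0], [1, 0, 0], [0, 0, 0]]
nets = [M, B]

print(change_stats(nets, 0, 1, 0))
print(pseudo_deviance(nets, [0.0] * 5))
```

In practice the parameter vector minimizing this deviance would be obtained by passing the change statistics and the observed ties to any standard logistic regression routine, which is what makes the approach computationally convenient.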

Since, as Padgett & Ansell (1993) note, the gaining of hierarchical status was the primary consideration in the arrangement of marriage ties between elite families, we might expect marriage ties to exhibit a tendency towards transitivity. Hence model 5a assumes, in addition to conditional dependencies for paths of length 2, pairwise conditional dependencies among marriage ties from i to j, j to k and i to k (and hence adds a parameter corresponding to the relation $X = MM \cap M$). Further, all possible stars comprising two relations are added as well, in order to investigate possible interdependencies between marriage and business ties that are not evident at the level of ties from an actor i to an actor j (see the comparison between the complete independence model 1a and the multiplex model 2). These dependencies also require various star parameters $\eta_Z$, for Z equal to $MM'$, $M'M$, $M'B$ and $BB'$.

Table 6. Summary of fit of models 1a–6d to the Florentine network

Model   No. of parameters   $G^2_{PL}$   Mean absolute residual
1a       2                  487.2        0.048
1b       1                  487.9        0.048
2        3                  486.8        0.048
3a       4                  323.2        0.032
3b       5                  323.0        0.032
4       11                  277.9        0.029
5a      18                  243.7        0.026
5b      17                  246.3        0.026
6a      21                  227.9        0.026
6b      23                  226.7        0.026
6c      23                  225.2        0.026
6d      23                  217.0        0.025

The fit of model 5a was a modest improvement over that of model 4 ($\Delta G^2_{PL} = 34.2$ with six additional parameters). The estimated parameter corresponding to the relation $MM \cap M$ is not large, so in model 5b the parameter is removed, with little effect on the fit of the model ($\Delta G^2_{PL} = 2.6$).

A final set of models fitted to the data investigated the possibility of structural differences in ties according to party affiliation. As Padgett & Ansell (1993) observed, the first 10 family groups are substantially identified with the Medici party (the Medici family themselves comprising group 1), whereas the remaining groups of families are not. Padgett & Ansell described the remarkable structural differences between the network of relations within the Medici party and within the remaining (largely oligarchic) set. Models 6a–6d therefore allow various model 5b parameters to differ according to whether they refer to ties lying within the collection of Medici blocks, to ties connecting non-Medici blocks, or to ties crossing the boundary between the two collections of blocks. Model 6a allows such variation for the density parameter and is a substantial improvement over model 5b ($\Delta G^2_{PL} = 18.4$ with four additional parameters). Model 6b permits the parameters for 'mixed' out-stars comprising marriage and business ties to differ for the three types of blocks, and is not a substantial improvement over model 6a ($\Delta G^2_{PL} = 1.4$). Model 6c allows heterogeneity across blocks in the parameters for 2-paths comprising marriage and business ties; it also fails to improve fit compared to model 6a ($\Delta G^2_{PL} = 2.5$). The final model, 6d, permits heterogeneity across blocks in the parameters for paths comprising two marriage ties; in this case, there is a modest improvement in fit compared to model 6a ($\Delta G^2_{PL} = 10.8$ with two additional parameters).

The estimated parameters for model 6d are shown in Table 7. The estimates suggest a strong tendency for reciprocated business ties, a tendency that is unsurprising given the form of business or economic ties such as partnerships. There are weaker tendencies for the existence of 2-paths comprising either marriage or business ties; marriage ties also appear to be more likely if they complete a cycle of three marriage ties. Padgett & Ansell (1993) noted the presence of these cycles and analysed both their development and their consequences; they make a compelling argument for their importance to the evolving structure of the oligarchy. It can also be seen from Table 7 that path structures in which an outgoing marriage tie is accompanied by an incoming business tie reduce the likelihood of the overall structure. Estimates of star parameters suggest the prevalence of heterogeneous stars in which a group of families have marriage ties with one group and business ties with another. The parameter estimates for homogeneous marriage in-stars and out-stars are both negative: there appears to have been a reduced conditional probability of a marriage tie to a family group if some other group also had such a tie and, to a lesser extent, if the first family group had another outgoing marriage tie.

The parameters for block-dependent densities suggest an enhanced likelihood of marriage ties within the Medici collection of family groups and, to a lesser extent, within the non-Medici collection; marriage ties between the two types of family groups were less likely. Business ties exhibit a substantially weaker pattern of the same form. Together, these characteristics of the network reflect what Padgett & Ansell noted was a remarkable interdependence of marriage and economic ties, on the one hand, and political partisanship, on the other, and they support their conclusion that the microstructure of marriage and economics was central to the formation of parties in Florence (1993, p. 1277). The block-dependence of marriage 2-paths takes a different and interesting form: such paths are less likely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection, and they are even more likely to link family groups of different types. The group containing members of the Medici family is the major contributor to this pattern, as they are the only Medici group with marriage connections outside the collection mobilized into the Medici party. Note that this structural effect is fitted at the same time as the cyclic pattern for marriage ties, so that although, as Padgett & Ansell noted, there are many more two-step marriage connections for non-Medici than for Medici partisans, many of the former connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).

Table 7. Parameter estimates for model 6d fitted to the Florentine network

Model parameter                       Z                $\eta_Z$   Approximate standard error
1-paths (choice)                      $M$              -5.17      1.02
                                      $B$              -7.37      1.25
2-cycles (reciprocity and exchange)   $M \cap M'$       0.95      0.94
                                      $B \cap B'$      10.33      1.72
                                      $M \cap B'$       0.65      1.08
2-paths                               $MM$              0.66      0.32
                                      $MB$              0.16      0.38
                                      $BM$             -0.84      0.37
                                      $BB$              1.26      0.95
3-cycles                              $MM \cap M'$      2.12      0.61
                                      $MB \cap M'$     -0.35      0.85
2-stars                               $MM'$            -1.55      0.37
                                      $M'M$            -0.43      0.20
                                      $BB'$            -1.53      1.08
                                      $B'B$            -0.85      0.99
                                      $MB'$            -0.14      0.36
                                      $M'B$             0.92      0.35

Subgroup-dependent M effects
1-paths within Medici                                   3.71      1.12
1-paths between subgroups                              -4.67      1.92
1-paths within other subgroups                          0.96

Subgroup-dependent B effects
1-paths within Medici                                   0.70      1.06
1-paths between subgroups                              -0.80      0.87
1-paths within other subgroups                          0.10

Subgroup-dependent MM effects
2-paths within Medici                                  -1.33      0.46
2-paths between subgroups                               1.08      0.44
2-paths within other subgroups                          0.25
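The block-dependent effects rest on classifying each tie by the party membership of the two family groups it connects. The sketch below shows that classification, taking groups 1 to 10 as the Medici collection as stated in the text. It also checks an observation about Table 7: within each set of subgroup-dependent effects, the three reported values sum to zero, which would be consistent with a sum-to-zero coding in which the third value is implied by the other two (and so has no standard error). That coding is an inference from the printed numbers, not something the text states, and the function names are hypothetical.

```python
MEDICI = set(range(1, 11))   # family groups 1..10, per the text


def block_type(i, j):
    """Classify a tie between family groups i and j by party membership."""
    if i in MEDICI and j in MEDICI:
        return "within Medici"
    if i not in MEDICI and j not in MEDICI:
        return "within other"
    return "between"


def implied_third(within_medici, between):
    """Third effect under an assumed sum-to-zero constraint."""
    return round(-(within_medici + between), 2)


print(block_type(1, 7), "|", block_type(3, 20), "|", block_type(15, 30))
print(implied_third(3.71, -4.67))   # marriage 1-paths
print(implied_third(0.70, -0.80))   # business 1-paths
print(implied_third(-1.33, 1.08))   # marriage 2-paths
```

If the coding assumption holds, the implied values reproduce the "within other subgroups" column of Table 7, which also supports the decimal placement used when reading the table.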

Thus, model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell's detailed account.

5. Conclusion

The multivariate p* model is very general in form and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996), and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number r of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research: much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus, a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus, the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies, or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form z(x) of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different 'parts' of the network (and in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that, in the literature on algebraic models for multivariate networks, there is a second tradition that can usefully guide such statistical developments. Local structural descriptions based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995) describe heterogeneity across individuals. Thus, a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown, in general, how Markov chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).
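The simulation step of such a parametric bootstrap can be sketched with a single-relation model that has only a choice and a reciprocity parameter. This is a minimal Metropolis sampler written for illustration, not Preisler's procedure or the authors' software; the function name, the starting state (an empty graph) and the one-tie-toggle proposal are all assumptions.

```python
import math
import random


def metropolis_sample(n, theta, steps, rng):
    """Simulate one directed graph on n nodes from a model with parameters
    theta = (choice, reciprocity). Each step proposes toggling one tie."""
    x = [[0] * n for _ in range(n)]
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)        # ordered pair with i != j
        # Conditional log-odds that the tie i->j is present, given the rest.
        log_odds = theta[0] + theta[1] * x[j][i]
        # Log ratio of model probabilities: proposed state over current state.
        log_ratio = log_odds if x[i][j] == 0 else -log_odds
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x[i][j] = 1 - x[i][j]             # accept the toggle
    return x


rng = random.Random(0)
graph = metropolis_sample(n=6, theta=(-1.0, 2.0), steps=5000, rng=rng)
for row in graph:
    print(row)
```

A bootstrap standard error would then be approximated by drawing many such graphs at the fitted parameter values, re-estimating the parameters by maximum pseudo-likelihood on each simulated graph, and taking the standard deviation of the resulting estimates.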

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754), and the National Institute of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance, and Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.

Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.

Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.

Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Association, 47, 77–92.

Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.

Boorman, S. A. & White, H. C. (1976). Social structure from multiple networks: II. Role structures. American Journal of Sociology, 81, 1384–1446.

Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.

Breiger, R. L., Boorman, S. A. & Arabie, P. (1975). An algorithm for clustering relational data, with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.

Coleman, J. S., Katz, E. & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N. & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R. & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B. & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency, and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure: II. Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E. & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds), Operations research proceedings 1986, pp. 455–460. Berlin/Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathématiques, Informatique et Sciences Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B. & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O. & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds), Quo Vadis, Graph Theory? Annals of Discrete Mathematics, 55, 349–366.

Frank, O. & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J. & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J. & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W. & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W. & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J. & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D. & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.


Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L. & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E. & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A. & Pattison, P. (1998). Multivariate p* models of producers' market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P. & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F. & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P. & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G. & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). $p_{1/2}$. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D. & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M. & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.


Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S. & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S. & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S. & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 61, 401–425.

Wasserman, S. & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S. & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A. & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C. & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999



Hammersley–Clifford theorem yields a characterization of P(X = x) in the form of an exponential family of distributions:

$$P(X = x) = \frac{1}{\kappa} \exp\Bigl( \sum_{A \subseteq N_D} \lambda_A \prod_{(i,j,m) \in A} x_{ijm} \Bigr), \tag{1}$$

where $\kappa = \sum_x \exp\bigl\{ \sum_{A \subseteq N_D} \lambda_A \prod_{(i,j,m) \in A} x_{ijm} \bigr\}$ is a normalizing quantity; D is the dependence graph for X (the summation is over all subsets A of nodes of D); $\prod_{(i,j,m) \in A} x_{ijm}$ is the sufficient statistic corresponding to the parameter $\lambda_A$; and $\lambda_A = 0$ whenever the subgraph induced by the nodes in A is not a clique of D.

The set of non-zero parameters in a model for P(X = x) is thus determined by the collection of the maximal cliques of the dependence graph. A maximal clique is a complete subgraph that is not properly contained in any other complete subgraph. Note that any subgraph of a complete subgraph is also complete, so that if A is a maximal clique of D, then there will be non-zero parameters for A and all of its subgraphs.
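As a rough illustration of model (1), the following Python sketch evaluates P(X = x) by brute-force enumeration for a toy multigraph. The setting (two actors, two relations), the parameter values in `lam`, the assumed multiplexity-plus-exchange dependence graph and all function names are hypothetical choices made for illustration; none of them comes from the paper. Enumeration is feasible here only because the array has four binary variables.

```python
from itertools import product
from math import exp

# Hypothetical toy array: g = 2 actors, r = 2 relations, no self-ties.
# Each node of the dependence graph is a triple (i, j, m).
variables = [(i, j, m) for i in range(2) for j in range(2) if i != j for m in range(2)]

# Made-up parameters lambda_A, one per clique A of an assumed dependence graph;
# any subset A that is not listed has lambda_A = 0.
lam = {
    frozenset({(0, 1, 0)}): -0.5,
    frozenset({(1, 0, 0)}): -0.5,
    frozenset({(0, 1, 1)}): -1.0,
    frozenset({(1, 0, 1)}): -1.0,
    frozenset({(0, 1, 0), (0, 1, 1)}): 0.8,  # multiplexity
    frozenset({(1, 0, 0), (1, 0, 1)}): 0.8,
    frozenset({(0, 1, 0), (1, 0, 1)}): 0.3,  # exchange
    frozenset({(1, 0, 0), (0, 1, 1)}): 0.3,
}


def exponent(x, lam):
    """Sum over cliques A of lambda_A * prod_{(i,j,m) in A} x_ijm."""
    return sum(value for clique, value in lam.items() if all(x[v] for v in clique))


def all_arrays(variables):
    """Yield every binary array x as a dict {(i, j, m): 0 or 1}."""
    for bits in product((0, 1), repeat=len(variables)):
        yield dict(zip(variables, bits))


def probability(x, lam, variables):
    """P(X = x) from model (1), with kappa computed by full enumeration."""
    kappa = sum(exp(exponent(y, lam)) for y in all_arrays(variables))
    return exp(exponent(x, lam)) / kappa


# The probabilities of all 16 arrays should sum to one.
total = sum(probability(y, lam, variables) for y in all_arrays(variables))
empty = {v: 0 for v in variables}
```

With every parameter set to zero (an empty `lam`), the same function returns the uniform probability 1/16 for each array, which is a convenient sanity check.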

3.1.2. Dependence structures for social networks

It is clear from model (1) (which we can refer to as the multivariate p* distribution) that, in order to construct a probability model for a multivariate random array, we need to specify an appropriate dependence structure. We therefore consider some likely forms of dependencies arising in multivariate arrays constructed from various types of social networks. The literature on structural models for social networks contains a number of theoretical claims about the structural properties of networks that can be used to construct candidate dependence structures.

Multiplexity: interdependence of relations linking a pair of individuals. The large literature on role-sets (Merton, 1957; Winship & Mandel, 1983; see also Chapter 12 of Wasserman & Faust, 1994) attests to the widespread belief that there is a likely dependence between different ties linking any given pair of individuals. The essence of the claim is that the presence of one type of tie between individuals is likely to affect the presence of other types of tie, and that, over time, distinctive role-sets comprising the relations linking a pair of individuals characterize the relationship from individual i to individual j. Such multiplex interdependencies lead to maximal cliques in the dependence graph of the form {(i, j, 1), (i, j, 2), ..., (i, j, r)}; if these are the only dependencies that are assumed, the general class of Bernoulli multigraphs is obtained. The parameters of the model have the form $\lambda_A$, where A is any subset of the form {(i, j, m_1), (i, j, m_2), ..., (i, j, m_q)}; the corresponding sufficient statistic in model (1) is $\prod_{h=1}^{q} x_{ijm_h}$. Of course, a special case of the model is obtained if we assume complete independence of all observations. The maximal cliques of the dependence graph are then of the form {(i, j, m)}, with exactly one parameter corresponding to each random variable in the array. If homogeneity is assumed for all node pairs (i, j), then the model has just one parameter for each relation m (with sufficient statistic $\sum_{i,j} x_{ijm}$; if homogeneity is imposed across all random variables in the multivariate network, then the model possesses a single parameter with sufficient statistic $\sum_{i,j,m} x_{ijm}$).

Exchange and reciprocity. It has also been widely argued that a tie of one type from an individual i to another individual j may be conditionally dependent on ties from j to i of other types (see, for example, Parsons, 1966, for a traditional appeal to mutually consistent expectations through role norms, and Leifer, 1988, for an alternative and interesting dynamic account). If such conditional dependencies alone are assumed, then maximal cliques of D have the form {(i, j, m), (j, i, l)}. If these conditional dependencies are assumed as well as the multiplex conditional dependencies, maximal cliques have the form

{(i, j, 1), (i, j, 2), ..., (i, j, r), (j, i, 1), (j, i, 2), ..., (j, i, r)}.

In the latter case, model (1) describes the multivariate dyad independence model, termed the multivariate $p_1$ model (Wasserman, 1987).

Role interlocking: path dependence. A third type of argument has pointed to the potential importance of role interlocking in social networks (for example, Boorman & White, 1976; Boyd, 1991; Lorrain & White, 1971; Pattison, 1993; White, 1977). It has been argued that the interrelationships among distinct types of ties can be represented by a partial ordering among labelled paths in a social network, where labelled paths systematically trace connections among sequences of individuals (see Pattison, 1993). More specifically, a path with the label mh links individual i to individual j if there is a tie of type m from i to some intermediate individual l and a tie of type h from l to j (that is, if $X_{ilm} = 1$ and $X_{ljh} = 1$). Longer paths are defined recursively: a tie of type mhn links individual i to individual j if there is some individual l such that i is linked to l by a path with the label mh and l is linked to j by a tie with the label n. We refer to the path mhn as the concatenation of paths mh and n, and note that concatenation is associative; that is, paths constructed as the concatenation of mh and n link precisely the same pairs of individuals as paths constructed by the concatenation of m and hn. Paths in networks have been claimed both to provide the essential framework for the flow of social processes, as, for example, in the research on the diffusion of innovations (see, for example, Coleman, Katz & Menzel, 1966; Michaelson, 1990), and to give rise to some powerful anticipatory effects (see Lee, 1969; Mayer, 1977). The most rudimentary form of dependence associated with social structures conceived in this form involves conditional dependence between the variables $X_{ilm}$ and $X_{ljh}$. The maximal cliques induced by such an assumption are cycles of length 2, {(i, j, m), (j, i, h)}, and cycles of length 3, {(i, j, m), (j, l, h), (l, i, n)}. The resulting random graph model is new and is a special case of the multivariate Markov random graphs mentioned earlier. Here, we term the corresponding version of model (1) the path-dependent random multigraph model.
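The definition of labelled paths and the associativity of concatenation can be checked on a small case. The sketch below assumes that a labelled path is recorded as a 0/1 matrix obtained by a Boolean product of relation matrices; the three relations on four individuals are made-up data, and `compose` is a hypothetical helper name.

```python
def compose(a, b):
    """Boolean product: entry (i, j) is 1 if some l has a[i][l] = 1 and b[l][j] = 1."""
    g = len(a)
    return [[int(any(a[i][l] and b[l][j] for l in range(g))) for j in range(g)]
            for i in range(g)]


# Hypothetical relations of types m, h and n on g = 4 individuals (made-up data).
X_m = [[0, 1, 0, 0], [0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0]]
X_h = [[0, 0, 1, 0], [1, 0, 0, 0], [0, 1, 0, 1], [0, 0, 0, 0]]
X_n = [[0, 1, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [0, 1, 0, 0]]

# Pairs linked by a path with the label mh.
path_mh = compose(X_m, X_h)

# Concatenating mh with n, or m with hn, links the same pairs of individuals.
left = compose(compose(X_m, X_h), X_n)
right = compose(X_m, compose(X_h, X_n))
```

Here `left == right` holds, as it must for any choice of 0/1 matrices, since the Boolean product is associative.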

Actor effects. The fourth argument has arisen in the social cognition literature and posits actor attributes or biases associated with either the actor from whom the tie is directed (leading to a so-called row effect) or the actor to whom the tie is directed (a so-called column effect). Row effects are associated with conditional dependencies of the form {(i, j, m), (i, k, h)} and give rise to maximal cliques in D of the form

{(i, 1, 1), (i, 1, 2), ..., (i, 1, r), (i, 2, 1), (i, 2, 2), ..., (i, 2, r), ..., (i, g, 1), (i, g, 2), ..., (i, g, r)}.

Such dependencies are likely to be assumed when actor i is the source of information about all relational ties emanating from actor i; they are also necessarily imposed if constraints are placed on the total number of ties directed from actor i (see Holland & Leinhardt, 1973). Column effects are associated with conditional dependencies of the form {(i, j, m), (k, j, h)}, and so with maximal cliques

{(1, j, 1), (1, j, 2), ..., (1, j, r), (2, j, 1), (2, j, 2), ..., (2, j, r), ..., (g, j, 1), (g, j, 2), ..., (g, j, r)}.

Position effects and blockmodels. A fifth theme in the structural analysis of multiple networks is that distinctive patterns of inter-individual ties are associated with particular social positions. Thus, individuals occupying similar social positions may exhibit similar conditional dependencies among ties, whereas those occupying distinct positions may possess quite distinct inter-tie dependencies. Thus, knowledge of social position may be used as a basis for some hypothesized equations among the parameters referring to particular patterns of conditional dependencies (determined by cliques in the dependence graph); these issues are further discussed below.

Interdependence of interlocking roles. In addition, several of these arguments may be combined. For instance, if we assume conditional dependencies associated with arguments for multiplexity, reciprocity and exchange, as well as role-interlocking effects, then the class of Markov random multigraphs results; its maximal cliques have the form of either a multivariate triad,

{(i, j, 1), (i, j, 2), ..., (i, j, r), (j, k, 1), (j, k, 2), ..., (j, k, r),
 (k, i, 1), (k, i, 2), ..., (k, i, r), (j, i, 1), (j, i, 2), ..., (j, i, r),
 (k, j, 1), (k, j, 2), ..., (k, j, r), (i, k, 1), (i, k, 2), ..., (i, k, r)},

or a multivariate star,

{(1, i, 1), (1, i, 2), ..., (1, i, r), (2, i, 1), (2, i, 2), ..., (2, i, r), ...,
 (g, i, 1), (g, i, 2), ..., (g, i, r), (i, 1, 1), (i, 1, 2), ..., (i, 1, r),
 (i, 2, 1), (i, 2, 2), ..., (i, 2, r), ..., (i, g, 1), (i, g, 2), ..., (i, g, r)}.

Note that these three assumptions also entail actor effects; hence, we claim that the class of Markov random multigraphs is a quite plausible framework for the modelling of structure in multiple social networks.

3.1.3. Homogeneity constraints

For many of the specific dependence graphs that we have discussed, particularly for Markov random multigraphs, model (1) may require the estimation of a large number of parameters. It is often useful, therefore, to introduce certain equality constraints among the parameters, or to set certain parameters to zero. One can define a class of homogeneous models for multivariate networks in which networks that are isomorphic under relabellings of the nodes are equiprobable.

More generally, we introduce an assumption that parameters corresponding to certain isomorphic configurations of nodes are equal. We identify a random graph configuration with a subset A of $N_D$, and we call configurations corresponding to A and B isomorphic if there is a one-to-one mapping w on the nodes in N such that (i, j, m) ∈ A if and only if (w(i), w(j), m) ∈ B, for i, j ∈ N, m ∈ R. If two configurations A and B are isomorphic, we set $\lambda_A = \lambda_B$ and note that the sufficient statistic corresponding to $\lambda_A$ becomes

$$\sum_{B} \prod_{(i,j,m) \in B} x_{ijm},$$

where the summation is over all configurations B isomorphic to A.

A more restricted form of parameter equating may also be useful when the nodes of the random graph are hypothesized to fall into distinct classes or positions (as in an a priori blockmodel; see, for example, White et al., 1976; Wasserman & Anderson, 1987; Wasserman & Faust, 1994, Chapter 10). In this case, the random graph nodes of the configuration identified with the subset A may be regarded as coloured, and two configurations A and B are defined as isomorphic if there is a one-to-one mapping w on the nodes of N such that

1. (i, j, m) ∈ A if and only if (w(i), w(j), m) ∈ B;
2. i and w(i) have the same colour;
3. j and w(j) have the same colour.

We then set $\lambda_A = \lambda_B$ only if A and B are isomorphic (using this more restrictive definition).

3.2. The multivariate p* model

As mentioned, we refer to equation (1) as the multivariate p* model. The parameters of the model correspond to the cliques of the dependence graph D. The sufficient statistic corresponding to the parameter $\lambda_A$ for clique A of D has the form $\prod_{(i,j,m) \in A} x_{ijm}$; in the case where homogeneity effects have been imposed, the sufficient statistics are counts of such values over cliques whose parameters are set to be equal.

3.2.1. Introduction

The dependence structures for social networks described in the preceding section give rise to network statistics identified in Table 1. In order more easily to define the statistics, we introduce a counting function f for an array Z as the sum of entries in the array: $f_Z = \sum_{i,j} Z_{ij}$. The function f is a count of the number of distinct ordered pairs of nodes i and j for which there is a relational tie of type Z. For convenience, we refer to the parameter corresponding to $f_Z$ as $\theta_Z$.

Table 1. Some statistics and parameters for univariate and multivariate relations

Effect                      Parameter        Graph statistic in 'count form'

Single dichotomous relations
Choice                      $\theta$         $f_X$
Mutuality                   $\rho$           $f_{X \cap X'}$
Transitivity                $\tau$           $f_{XX \cap X}$
Expansiveness               $\alpha_i$       $f_{X \cap R_i}$
Attractiveness              $\beta_j$        $f_{X \cap C_j}$
m-paths                     $\pi_m$          $f_{X^m}$
Subgroup density            $\phi_{[st]}$    $f_{X \cap d_{st}}$
Subgroup mutuality          $\rho_{[st]}$    $f_{X \cap X' \cap d_{st}}$
Subgroup transitivity       $\tau_{[sut]}$   $f_{(X \cap d_{su})(X \cap d_{ut}) \cap (X \cap d_{st})}$

Multivariate relations
Association                 C                $f_{X \cap Y}$
Multiplexity                $\theta_{kl}$    $f_{X_k \cap X_l}$
Exchange                    $\rho_{kl}$      $f_{X_k \cap X_l'}$
Generalized transitivity    $\tau_{klm}$     $f_{(X_k X_l) \cap X_m}$

When homogeneity constraints are imposed, we can represent the sufficient statistics in a compact form. For the assumption of multiplex conditional dependencies, any clique in $N_D$ has the form

A = {(i, j, m_1), (i, j, m_2), ..., (i, j, m_q)};

thus, in the homogeneous case, the sufficient statistic for the multiplex parameter associated with the clique A is $f_Z$, where $Z = X_{m_1} \cap X_{m_2} \cap \cdots \cap X_{m_q}$. (Note that any non-empty subset of relations gives rise to a clique of this form, so that we also have statistics of the form $f_{X_m}$, $f_{X_k \cap X_l}$, and so forth.)

Reciprocity cliques of the form {(i, j, m), (j, i, l)} give rise to the exchange statistics $f_{X_k \cap X_l'}$. Cliques in role-interlocking dependence structures lead to additional 2-path and 3-cycle statistics of the form $f_{X_m X_h}$ and $f_{(X_m X_h) \cap X_n'}$, respectively.

Some of the statistics for parameters reflecting row and column effects can be defined using the indicator matrices $R_i$ and $C_j$, whose elements are given by

$$(R_i)_{kl} = \begin{cases} 1 & \text{if } k = i, \\ 0 & \text{otherwise,} \end{cases} \qquad (C_j)_{kl} = \begin{cases} 1 & \text{if } l = j, \\ 0 & \text{otherwise.} \end{cases}$$
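The statistics in 'count form' can be computed directly from the relation matrices. The following sketch is an illustration only. It assumes that the intersection of two relations is the elementwise product of their 0/1 matrices, that a product such as $X_k X_l$ is the ordinary matrix product (so that it counts 2-paths, including any that return to their starting node), and that the prime denotes the transpose. The two relations `X1` and `X2` on four actors are made-up data, and the helper names are hypothetical.

```python
def transpose(a):
    """Converse relation X': entry (i, j) of the result is a[j][i]."""
    return [list(col) for col in zip(*a)]


def intersect(a, b):
    """Elementwise product of two 0/1 arrays (the 'intersection' of two relations)."""
    return [[u * v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def matmul(a, b):
    """Ordinary matrix product; entry (i, j) counts 2-paths from i to j."""
    g = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(g)) for j in range(g)]
            for i in range(g)]


def f(z):
    """Counting function f_Z: the sum of all entries of Z."""
    return sum(map(sum, z))


def row_indicator(i, g):
    """R_i: ones in row i, zeros elsewhere."""
    return [[1 if k == i else 0 for _ in range(g)] for k in range(g)]


# Made-up relations on g = 4 actors (zero diagonals).
X1 = [[0, 1, 1, 0], [1, 0, 0, 1], [0, 1, 0, 0], [1, 0, 1, 0]]
X2 = [[0, 0, 1, 1], [0, 0, 1, 0], [1, 0, 0, 0], [0, 1, 1, 0]]

choice_1 = f(X1)                                        # f_{X1}
choice_2 = f(X2)                                        # f_{X2}
mutuality_1 = f(intersect(X1, transpose(X1)))           # f_{X1 ∩ X1'}
multiplexity = f(intersect(X1, X2))                     # f_{X1 ∩ X2}
exchange = f(intersect(X1, transpose(X2)))              # f_{X1 ∩ X2'}
two_paths = f(matmul(X1, X2))                           # f_{X1 X2}
generalized_transitivity = f(intersect(matmul(X1, X2), X1))  # f_{(X1 X2) ∩ X1}
expansiveness_0 = f(intersect(X1, row_indicator(0, 4)))      # f_{X1 ∩ R_0}
```

For these data the counts are 7 and 6 ties, 2 mutual (ordered) pairs, 2 multiplex ties, 4 exchange pairs, 10 two-paths, 4 generalized transitive configurations and 2 ties sent by the first actor.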

In order to define statistics for the Markov random multigraph model, let $R_k$ be any subset of relations and define $Y_k$ as the intersection of the relations in $R_k$. The triad statistic corresponding to a general multivariate triad has the general form $f_Z$, with $Z = (Y_1 \cap Y_4')(Y_2 \cap Y_5') \cap (Y_3 \cap Y_6')$, for some $Y_1, Y_2, \ldots, Y_6$.

When homogeneity is imposed only within S possible blocks or positions, the network statistics that arise correspond to within-block sums and can be represented by using the matrix $d_{st}$ with entries

$$(d_{st})_{ij} = \begin{cases} 1 & \text{if } i \in \text{block } s \text{ and } j \in \text{block } t, \\ 0 & \text{otherwise.} \end{cases}$$

For example, in the case of any homogeneous statistic $f_Z$, the block-homogeneous set of statistics is $\{f_{Z \cap d_{st}} : s = 1, 2, \ldots, S;\ t = 1, 2, \ldots, S\}$.

Some other network statistics and associated parameters are also presented in Table 1. This table also identifies the parameter labels used in Wasserman & Pattison (1996) and their generalizations to multivariate networks.

Note that each of the statistics described above may be assumed to be homogeneous, or may be allowed to depend on some mutually exclusive and exhaustive partition of actors or pairs of actors. For example, generalized transitivity statistics may be calculated for every triple of subgroups arising from a partition (for example, $f_{(X_k \cap d_{su})(X_l \cap d_{ut}) \cap (X_m \cap d_{st})}$) and may be used to assess the homogeneity of generalized transitivity across subgroups.
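A brief sketch of the block-homogeneous statistics $f_{Z \cap d_{st}}$ follows. The partition of four actors into two blocks and the relation `X` are made-up, and the function names are hypothetical; the intersection is again taken to be an elementwise product.

```python
def block_indicator(blocks, s, t):
    """d_st: entry (i, j) is 1 if actor i is in block s and actor j is in block t."""
    g = len(blocks)
    return [[1 if blocks[i] == s and blocks[j] == t else 0 for j in range(g)]
            for i in range(g)]


def intersect(a, b):
    """Elementwise product of two 0/1 arrays."""
    return [[u * v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]


def f(z):
    """Counting function f_Z: the sum of all entries of Z."""
    return sum(map(sum, z))


# Made-up relation on g = 4 actors and a hypothetical two-block partition.
X = [[0, 1, 1, 0], [1, 0, 0, 1], [0, 1, 0, 0], [1, 0, 1, 0]]
blocks = [0, 0, 1, 1]  # actors 0 and 1 in block 0, actors 2 and 3 in block 1

# One subgroup-density count per ordered pair of blocks (s, t).
block_counts = {
    (s, t): f(intersect(X, block_indicator(blocks, s, t)))
    for s in (0, 1)
    for t in (0, 1)
}
```

Because the blocks partition the actors, the block counts add up to the homogeneous statistic $f_X$; the same construction applies to any other statistic $f_Z$ by replacing `X` with the relevant array Z.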


3.2.2. The model

In combination with various homogeneity constraints, model (1) can be written in the general form

$$P(X = x) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)}, \tag{2}$$

where $\theta$ is a vector of model parameters and z(x) is a vector of network statistics. As we have described, these vectors depend on the structure of the hypothesized dependence graph and on whether any homogeneity constraints have been proposed.

The model is of exponential family form; that is, the probability function depends on an exponential function of a linear combination of network statistics. In some cases, constraints on the elements of $\theta$ are required in order to ensure a set of uniquely determined parameters (as we illustrate later with our examples). Usually, the elements of $\theta$ are unknown and must be estimated.

The function $\kappa(\theta)$ in the denominator of model (2) is a normalizing quantity whose value guarantees that the probability distribution is indeed proper, summing to unity over the sample space of the random variable X (the set of all possible multivariate networks with r relations and g actors).

Estimation of the parameters of models that assume only multiplexity and/or generalized reciprocity and exchange effects (as in the multivariate $p_1$ model) is not particularly difficult. In these cases, the likelihood function is simply the product of the probabilities for each multivariate tie or dyad (for example, see Wasserman, 1987). Estimation of parameters of the general multivariate p* model is not straightforward, however. The likelihood function for the parameters $\theta$ of p* depends on the complicated normalizing quantity $\kappa(\theta)$, which makes maximum likelihood estimation difficult, except in special circumstances (such as dyadic independence) and when the multigraphs are quite small (Walker, 1995). In order for probabilities to be computed, one must be able to calculate $\kappa$, which is just too difficult for most networks. Hence, alternative model formulations and approximate estimation techniques are important. One such alternative, which we now describe, utilizes log-odds ratios of the conditional probabilities of each element of X.
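To give a sense of scale for the normalizing quantity, the sketch below counts the arrays over which $\kappa(\theta)$ must sum. It assumes directed relations with no self-ties, so that there are r·g·(g − 1) binary variables; the function name is hypothetical.

```python
def sample_space_size(g, r):
    """Number of binary arrays with r directed relations on g actors, assuming no self-ties."""
    return 2 ** (r * g * (g - 1))


small = sample_space_size(3, 2)    # 2**12 arrays: enumeration is still feasible
grade7 = sample_space_size(29, 2)  # g = 29 and r = 2, as in the example of Section 4.1
digits = len(str(grade7))          # number of decimal digits of 2**1624
```

Under this assumption, a network with 3 actors and 2 relations has 4096 possible arrays, whereas one with 29 actors and 2 relations has a number of arrays with 489 decimal digits, which rules out direct summation.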

3.2.3. The logit model

We can turn model (2) into a generalized autologistic model for conditional probabilities, giving us an equivalence between model (2) and spatial models (Besag, 1972, 1974; Strauss, 1992). The step utilizes the dichotomous nature of the random variable $X_{ijm}$ and produces an approximate likelihood function that is much easier to deal with.

We first condition on the complement of $X_{ijm}$ and consider just the probability that the dichotomous random variable $X_{ijm}$ is unity. Recall that this variable records whether the tie from i to j of type m is present. Specifically, consider

$$P(X_{ijm} = 1 \mid X^c_{ijm}) = \frac{P(X = x^+_{ijm})}{P(X = x^+_{ijm}) + P(X = x^-_{ijm})} = \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^+_{ijm})\} + \exp\{\theta' z(x^-_{ijm})\}}, \tag{3}$$

which has the advantage of not depending on the normalizing quantity. We next consider the odds ratio, which simplifies model (3):

$$\frac{P(X_{ijm} = 1 \mid X^c_{ijm})}{P(X_{ijm} = 0 \mid X^c_{ijm})} = \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^-_{ijm})\}} = \exp\{\theta' [z(x^+_{ijm}) - z(x^-_{ijm})]\}. \tag{4}$$

From this, the log-odds ratio, or logit model, has the rather simple expression

$$\omega_{ijm} = \log\left\{ \frac{P(X_{ijm} = 1 \mid X^c_{ijm})}{P(X_{ijm} = 0 \mid X^c_{ijm})} \right\} = \theta' [z(x^+_{ijm}) - z(x^-_{ijm})]. \tag{5}$$

If we define $\delta(x_{ijm}) = [z(x^+_{ijm}) - z(x^-_{ijm})]$, then the logit model (5) simplifies succinctly to $\omega_{ijm} = \theta' \delta(x_{ijm})$. The expression $\delta(x_{ijm})$ is the vector of network statistics that arises when the quantity $x_{ijm}$ changes from 1 to 0. This version of the model is a logit p* model for a multivariate network and is a generalized autologistic model (see Strauss, 1992) applied to social network data.

3.3. Estimation

The likelihood function for the general form of multivariate p* model (2) is

$$L(\theta) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)},$$

where the dependence on the normalizing quantity can easily be seen. As mentioned, maximum likelihood estimation of $\theta$ is difficult, due to the size of the sample space.

An approximate estimation approach, proposed by Besag (1975, 1977b) and adopted by Strauss (1986), Strauss & Ikeda (1990) and Wasserman & Pattison (1996), utilizes tools made popular in models for rectangular lattices and spatial data; specifically, we use the logit formulation and define the pseudo-likelihood function as

$$PL(\theta) = \prod_{i \neq j} \prod_{m=1}^{r} P(X_{ijm} = 1 \mid X^c_{ijm})^{x_{ijm}} \, P(X_{ijm} = 0 \mid X^c_{ijm})^{1 - x_{ijm}}, \tag{6}$$

and a maximum pseudo-likelihood estimator (MPLE) to be the value of $\theta$ that maximizes (6). MPLEs are much easier to calculate than maximum likelihood estimators (MLEs). MPLEs differ from MLEs for all but the simplest models (those for which the conditional probabilities are indeed independent of the complement relation). Basically, the approach assumes conditional independence of the random variables representing the multivariate relational ties (for discussion of the issues in using maximum pseudo-likelihood rather than maximum likelihood estimation, see Wasserman & Pattison, 1996, and Preisler, 1993).

There is a large literature on the use of approximate likelihoods in spatial modelling. Diggle (1996) reviews models for discrete spatial variation and notes that there are several possible estimation techniques. He notes in his detailed discussion that MPLEs are more efficient than other possibilities (which include the coding method of Besag, 1974). Further, for moderately large samples, the differences between MPLEs and MLEs are often negligible. Small sample sizes, and hence small networks (g < 10), unfortunately, are particularly problematic.

In social network modelling, Strauss & Ikeda (1990) established that estimation of $\theta$ for single, dichotomous relations can be accomplished via logistic regression, using any standard logistic regression model-fitting routine. In particular, they showed that maximizing the pseudo-likelihood given in equation (6) is equivalent to maximizing the likelihood function for the fit of logistic regression to model (5) (for independent observations $x_{ijm}$). Further, they observed that such logistic regressions can be fitted using iteratively reweighted Gauss–Newton computational techniques, as implemented by any logistic regression model package.

The proof of this result uses the fact that the derivatives of the pseudo-likelihood, set equal to zero, are identical to those obtained from a logistic regression with the relational variables as data values. Thus, fitting p* can be done by using the logit p* form and assuming that the relational variables are actually statistically independent. The idea for this theorem was first suggested by Frank & Strauss (1986) for estimation of the parameters in their triad model. The generalization of this result to the three-way binary array X is straightforward.
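The logistic regression route to the MPLE can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the fitting routine used in the paper: it uses a hypothetical three-statistic model (a choice parameter for each of two relations plus one multiplexity parameter), made-up data on four actors, and a plain Newton iteration with step halving in place of a statistical package. For this particular model the change statistic for multiplexity is simply the tie on the other relation.

```python
from math import exp, log

# Made-up data: two relations on g = 4 actors, stored as x[m][i][j].
X1 = [[0, 1, 1, 0], [1, 0, 0, 1], [0, 1, 0, 0], [1, 0, 1, 0]]
X2 = [[0, 0, 1, 1], [0, 0, 1, 0], [1, 0, 0, 0], [0, 1, 1, 0]]
x = [X1, X2]


def design(x):
    """One row per tie variable: change statistics for (f_X1, f_X2, f_{X1 ∩ X2})."""
    g = len(x[0])
    rows, y = [], []
    for m in range(2):
        for i in range(g):
            for j in range(g):
                if i != j:
                    other = x[1 - m][i][j]
                    rows.append([1.0 if m == 0 else 0.0, 1.0 if m == 1 else 0.0, float(other)])
                    y.append(x[m][i][j])
    return rows, y


def log_pl(theta, rows, y):
    """Log pseudo-likelihood, i.e. the logistic regression log-likelihood."""
    total = 0.0
    for d, yi in zip(rows, y):
        eta = sum(t * v for t, v in zip(theta, d))
        total += yi * eta - log(1.0 + exp(eta))
    return total


def solve(a, b):
    """Solve a s = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[c])]
    return [aug[k][n] / aug[k][k] for k in range(n)]


def fit_mple(rows, y, iterations=50):
    """Newton iterations with step halving on the log pseudo-likelihood."""
    k = len(rows[0])
    theta = [0.0] * k
    for _ in range(iterations):
        grad = [0.0] * k
        hess = [[0.0] * k for _ in range(k)]
        for d, yi in zip(rows, y):
            p = 1.0 / (1.0 + exp(-sum(t * v for t, v in zip(theta, d))))
            for a in range(k):
                grad[a] += (yi - p) * d[a]
                for b in range(k):
                    hess[a][b] += p * (1.0 - p) * d[a] * d[b]
        step = solve(hess, grad)
        scale, current = 1.0, log_pl(theta, rows, y)
        while scale > 1e-8 and log_pl([t + scale * s for t, s in zip(theta, step)], rows, y) < current:
            scale /= 2.0
        theta = [t + scale * s for t, s in zip(theta, step)]
        if max(abs(scale * s) for s in step) < 1e-10:
            break
    return theta


rows, y = design(x)
theta_hat = fit_mple(rows, y)
```

Because this toy model is saturated for the 2 × 2 table of tie pairs (5 pairs with only the first tie, 4 with only the second, 2 with both, 1 with neither), the estimates can be checked by hand: they should equal log 5, log 4 and log 0.1, the log-odds and log odds ratio of that table. In practice one would pass `rows` and `y` to any standard logistic regression routine instead of the hand-written Newton loop.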

The evaluation of the fit of multivariate p* is not straightforward, but it is helpful to compare the observed values $x_{ijm}$ with the fitted values $\hat{x}_{ijm}$. The fitted values, as is common with dichotomous variables, are defined as $\hat{x}_{ijm} = \hat{P}(X_{ijm} = 1 \mid X^c_{ijm})$. The estimated conditional probabilities are computed from

$$\operatorname{logit} \hat{P}(X_{ijm} = 1 \mid X^c_{ijm}) = \hat{\theta}' \delta(x_{ijm}).$$

Two useful indices of fit are the pseudo-likelihood ratio statistic,

$$G^2_{PL} = 2 \sum x_{ijm} \log(x_{ijm} / \hat{x}_{ijm}),$$

for a model, and the mean of the absolute value of the residuals, $(x_{ijm} - \hat{x}_{ijm})$. In the examples below, we report both $G^2_{PL}$ and the mean absolute residual. Unfortunately, as with all other uses of this MPLE approach, the distribution of $G^2_{PL}$ is unknown, even asymptotically, and there is no straightforward way of estimating the standard errors of parameter estimates (although asymptotic standard errors calculated from logistic regression models can give approximate guidance to the modeller). Crouch & Wasserman (1998) give some preliminary results comparing MPLEs to MLEs and report the optimistic finding that, for moderately large networks (g > 10), both standard errors and test statistics based on the pseudo-likelihood approach are quite close to those based on the exact likelihood.
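The two fit indices are simple to compute once fitted conditional probabilities are available. The sketch below reads the sum in $G^2_{PL}$ as running over both outcomes of each tie variable (tie present and tie absent), which makes it the usual logistic regression deviance; this reading is an assumption, although it appears consistent with the magnitudes reported in Table 4. The observed and fitted values are made-up, and the function names are hypothetical.

```python
from math import log


def pseudo_likelihood_ratio(observed, fitted):
    """G^2_PL read as a binomial deviance: -2 * sum of log fitted probability of the observed outcome."""
    total = 0.0
    for x, p in zip(observed, fitted):
        total += -2.0 * (log(p) if x == 1 else log(1.0 - p))
    return total


def mean_absolute_residual(observed, fitted):
    """Mean of |x - x_hat| over all tie variables."""
    return sum(abs(x - p) for x, p in zip(observed, fitted)) / len(observed)


# Made-up example: 8 tie variables, 3 of them present, and a one-parameter
# model whose fitted value for every tie is the overall density 3/8.
observed = [1, 0, 0, 1, 0, 0, 0, 1]
fitted = [3.0 / 8.0] * len(observed)

g2 = pseudo_likelihood_ratio(observed, fitted)
mar = mean_absolute_residual(observed, fitted)
```

For a constant fitted value p, the mean absolute residual reduces to 2p(1 − p) when p equals the observed density, which is 0.46875 in this example.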

3.4. Computational details

Maximum pseudo-likelihood estimates of the parameters of model (1) are obtained by fitting the logistic regression model (5). In order to fit model (5), we compute, for each relational tie, the values of the 'explanatory variables' $z(x^+_{ijm}) - z(x^-_{ijm})$ corresponding to each statistic z(x); we then use these as the observed explanatory variables for the realization of $X_{ijm}$ (the 'response variable') in the logistic regression corresponding to model (5).

The computation of the values $z(x^+_{ijm}) - z(x^-_{ijm})$ is simple, but it is useful to note that the values may take a different form for the various types of relational ties (corresponding to the subscript m of $X_{ijm}$). For example, suppose that there are two relations, $X_l$ and $X_h$, respectively, and consider the parameter corresponding to the triadic effect $Z = (X_l X_h) \cap X_h$. If we assume homogeneity, then the sufficient statistic for this parameter is $f_Z$. For the two relations, the computed values of the explanatory variable for this triadic effect are equal to the changes in the statistic $f_Z$ when $x_{ijm}$ changes from 1 to 0, for m = l or h. Thus, when m = l (corresponding to the values for the first relation, $X_l$), we compute $\sum_k x_{ikh} x_{jkh}$ as the value of the explanatory variable corresponding to this parameter, and when m = h (corresponding to an $X_h$ tie), we compute $\sum_k x_{ikl} x_{kjh} + \sum_k x_{kil} x_{kjh}$.
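The two closed-form expressions for the triadic effect can be compared with a brute-force difference of $f_Z$. The sketch below assumes that $f_Z = \sum_{i,j,k} x_{ikl} x_{kjh} x_{ijh}$ (ordinary matrix product, elementwise intersection) and that both relations have zero diagonals; the relations are pseudo-random made-up data, and the function names are hypothetical.

```python
import random


def f_triadic(xl, xh):
    """f_Z for Z = (X_l X_h) ∩ X_h, i.e. sum over i, j, k of x_ikl * x_kjh * x_ijh."""
    g = len(xl)
    return sum(xl[i][k] * xh[k][j] * xh[i][j]
               for i in range(g) for j in range(g) for k in range(g))


def change_l(xl, xh, i, j):
    """Closed form for a tie of type l: sum_k x_ikh * x_jkh."""
    return sum(xh[i][k] * xh[j][k] for k in range(len(xl)))


def change_h(xl, xh, i, j):
    """Closed form for a tie of type h: sum_k x_ikl * x_kjh + sum_k x_kil * x_kjh."""
    g = len(xl)
    return (sum(xl[i][k] * xh[k][j] for k in range(g))
            + sum(xl[k][i] * xh[k][j] for k in range(g)))


def brute_force_change(xl, xh, i, j, relation):
    """f_Z with the tie present minus f_Z with the tie absent."""
    values = []
    for v in (1, 0):
        a = [row[:] for row in xl]
        b = [row[:] for row in xh]
        (a if relation == "l" else b)[i][j] = v
        values.append(f_triadic(a, b))
    return values[0] - values[1]


# Pseudo-random 0/1 relations on g = 6 actors with zero diagonals (made-up data).
rng = random.Random(0)
g = 6
xl = [[rng.randint(0, 1) if i != j else 0 for j in range(g)] for i in range(g)]
xh = [[rng.randint(0, 1) if i != j else 0 for j in range(g)] for i in range(g)]

agree = all(
    change_l(xl, xh, i, j) == brute_force_change(xl, xh, i, j, "l")
    and change_h(xl, xh, i, j) == brute_force_change(xl, xh, i, j, "h")
    for i in range(g) for j in range(g) if i != j
)
```

The agreement relies on the zero diagonals: with self-ties allowed, $x_{ijh}$ could enter a single term of $f_Z$ twice and the statistic would no longer be linear in it.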

4. Examples

We illustrate the construction and fitting of multivariate p* models using two examples.

4.1. The Grade 7 peer network

The first example is an extension of the data analysed by Wasserman & Pattison (1996). Vickers (1981) and Vickers & Chan (1981) obtained network data from 29 students in grade 7 in a school in Victoria, Australia. They asked students to nominate their classmates on a number of relations, including the following:

1) Who are your best friends in the class?
2) Who would you rather not have as a friend?

We label the relations defined by these two questions as $X_B$ (relation 1) and $X_N$ (relation 2), and their associated matrices as B and N, respectively. The matrix for the 'best friends' relation is given here as our Table 2, and the matrix for the 'not friends' relation as our Table 3. As noted by Wasserman & Pattison (1996), actors 1–12 are boys, while actors 13–29 are girls.

In Wasserman & Pattison (1996), we analysed the relation $X_B$ and established that it possessed strong reciprocity and transitivity effects. Here, we fit models simultaneously to the relations $X_B$ and $X_N$ in an attempt to model their mutual interdependence. Our models use the methodology described earlier and are guided by the literature that has speculated on the structure of positive and negative affect ties (see the discussion in Wasserman & Faust, 1994, Chapter 6, on signed graphs); we also compare our models to previous descriptive analyses of similar types of ties. We report the fit of a number of homogeneous models.

Models 1a and 1b: independence. We first fit two versions of a complete independence model, in which we make the (implausible) assumption that all observed ties are independent. In the first version of the model, we allow a single separate 'choice' parameter $\theta_Z$ (where Z may be either B or N) for each type of relation; in the second, more restricted version, we assume a single common choice parameter. In both versions of the model, the maximal cliques of the dependence graph have the form {(i, j, m)}; in model 1a, the parameters corresponding to this clique are assumed to depend on relation m (but not on actor i or j), whereas in model 1b, the parameter is assumed constant for all i, j and m. The sufficient statistics for model 1a are $f_B$ and $f_N$; model 1b has sufficient statistic $f_{B+N}$. The fit of models 1a and 1b is summarized in Table 4. Neither model provides a good fit, with the mean of the absolute residuals equal to approximately 0.37. Since model 1b is nested in model 1a, the difference between the pseudo-likelihood ratio statistics is of interest, and we note that model 1b appears to be no worse a fit than model 1a ($\Delta G^2_{PL} = 3.2$, and the models differ by one parameter).

Model 2: multiplexity. Model 2 is a multiplexity model with maximal cliques {(i, j, 1), (i, j, 2)}. The model allows for the possibility that an $X_B$ tie from i to j is conditionally dependent on an $X_N$ tie from i to j. The parameters of the model have the form $\theta_Z$, where Z may be B, N or $B \cap N$; the corresponding sufficient statistics are $f_B$, $f_N$ and $f_{B \cap N}$, respectively. Thus, this model adds a single multiplex parameter, $\theta_{B \cap N}$, to the two choice parameters in model 1a. Model 2 appears to be a substantial improvement over model 1a ($\Delta G^2_{PL} = 253.7$, with one additional parameter), but the small frequency of $B \cap N$ ties implies that the MPLE of its corresponding parameter is likely to have a large standard error.

Models 3a and 3b ndash reciprocity and exchange Model 3 assumes bivariate dyad indepen-dence (as described by Wasserman 1987) and has maximal cliques(i j 1) (i j 2) (j i 1) (j i 2) We t two restricted versions of the model rst model3a in which only choice and reciprocity effects are assumed (with parameters hz forZ = B N B Ccedil B 9 and N Ccedil N9 ) and second model 3b with an additional exchange para-meter hz for the relation Z = B Ccedil N 9 In model 3a the presence of an XB tie from i to j isassumed to be conditionally dependent on the presence of an XB tie from j to i (that is on thepresence of reciprocity) similarly for XN ties Model 3b allows in addition the presence ofan XB tie from i to j to be conditionally dependent on the presence of an XN tie from j to i (that

Philippa Pattison and Stanley Wasserman182

Table 2 Vickers amp Chanrsquos (1981) network data lsquobest friendsrsquo relation

0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 1 1 0 1 0 1 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 00 0 0 1 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 1 0 0 0 1 0 1 0 1 1 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 0 1 01 1 1 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 01 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 0 0 1 1 11 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 00 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 01 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 0 0 1 1 1 1 0 0 0 0 0 0 01 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 1 1 1 1 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 1 0 0 0 1 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 1 1 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 1 0 1 0 0 1 1 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 1 0 0 1 1 1 10 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 00 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0

is on the exchange of an XN tie for an XB one) We have not tted the most generalhomogeneous dyad-independence model which includes multiplexity parameters since Band N co-occur only rarely (and as a result it is dif cult to t parameters corresponding torelations such as B Ccedil N B Ccedil N Ccedil B 9 and so forth) The t statistics in Table 4 indicate thatnot only is model 3a a substantial improvement over model 1a (DG2

PL = 2086 with just twoadditional parameters) but also that model 3b provides a marginally better t than model 3a

Logit models and logistic regressions for social networks II 183

Table 3 Vickers amp Chanrsquos (1981) network data lsquonot friendsrsquo relation

0 0 0 0 0 0 1 0 1 1 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 0 0 1 10 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 10 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 1 0 0 00 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 01 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 10 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 00 0 0 0 1 1 0 0 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 01 0 1 1 0 0 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 10 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 00 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 10 0 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 00 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 00 0 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 10 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 00 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 00 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 01 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 10 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 11 0 1 1 0 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 00 1 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 01 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 11 0 0 1 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 01 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 01 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0

Table 4. Summary of fit of models 1a–5b to the grade 7 peer network

Model   No. of parameters   G²_PL    Mean absolute residual

1a       2                  1794.1   0.366
1b       1                  1797.3   0.367
2        3                  1540.4   0.329
3a       4                  1585.5   0.315
3b       5                  1558.4   0.311
4       13                  1511.0   0.300
5a      19                  1220.6   0.241
5b      23                  1032.3   0.196

(ΔG²_PL = 27.1 with one additional parameter). These figures suggest the presence of both reciprocity and exchange effects. Note, though, that the fit of model 3b is still not particularly good, with the mean of the absolute residuals equal to 0.311.

Model 4 – path dependence. Model 4 is a path-dependent model and assumes that a tie of any type from i to j may be conditionally dependent on ties of any type from j to some third individual k. Maximal cliques therefore have the form {(i, j, m), (j, i, h)} or {(i, j, m), (j, k, h), (k, i, p)}; parameters and sufficient statistics are given by hZ and fZ, respectively, where Z may be any of the relations B, N, B ∩ B′, N ∩ N′, B ∩ N′, BB, BN, NB, NN, BB ∩ B′, BN ∩ B′, BN ∩ N′ and NN ∩ N′. Compared to model 3b, model 4 adds only marginally to the fit (ΔG²_PL = 47.4 with eight additional parameters).

Models 5a and 5b – restricted Markov random graph models. The final set of models are path-dependent models with additional dependencies assumed on substantive grounds. All models have the model 4 parameters; in addition, model 5a possesses dependencies consistent with the transitivity-like hypothesis that friends are likely to agree on their relations with third parties (hence likely pairwise conditional dependencies between relational ties of type X_B from i to j, i to k and j to k, and also between relational ties of type X_B from i to j, of type X_N from i to k and of type X_N from j to k). Model 5b possesses additional dependencies consistent with the claim that non-friends are likely to disagree on their relations with third parties (hence likely pairwise conditional dependencies between relational ties of type X_N from i to j, of type X_N from i to k and of type X_B from j to k, and also between relational ties of type X_N from i to j, of type X_B from i to k and of type X_N from j to k). (See Johnsen (1986) for a review and analysis of the literature on the structure of affective ties, and Pattison (1993) for an algebraic translation of these structural claims.) Model 5a adds {(i, j, 1), (j, k, 1), (i, k, 1)} and {(i, j, 1), (j, k, 2), (i, k, 2)} to the set of maximal cliques for model 4; model 5b also adds {(i, j, 2), (j, k, 1), (i, k, 2)} and {(i, j, 2), (j, k, 2), (i, k, 1)}. We note that all of the subcliques of these additional maximal cliques have corresponding parameters in models 5a and 5b; these additional subcliques correspond to various forms of stars, {(i, j, m), (i, k, h)}, {(i, j, m), (k, j, h)} and {(i, j, m), (j, k, h)}. As indicated in Table 4, the additional dependencies assumed by model 5a lead to a substantial improvement over the simple path-dependent model 4 (ΔG²_PL = 290.4 with six additional parameters), and those associated with model 5b lead to a modest further improvement in fit (ΔG²_PL = 188.3 with four additional parameters). The mean of the absolute residuals for model 5b is 0.196, suggesting a more reasonable fit to the data (but one that could lend itself to further possible improvement).
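The triadic configurations added in models 5a and 5b can be made concrete with a small counting routine. The following is only an illustrative sketch, not the authors' code: the function name `triad_count` and the toy matrices are hypothetical, relation 1 is read as X_B and relation 2 as X_N (as the verbal description suggests), and counting ordered triples of distinct individuals is an assumption about how such a configuration might be tallied.

```python
import numpy as np


def triad_count(first, second, closing):
    """Count ordered triples (i, j, k) of distinct nodes with
    first[i, j] = 1, second[j, k] = 1 and closing[i, k] = 1."""
    first = np.asarray(first, dtype=int)
    second = np.asarray(second, dtype=int)
    closing = np.asarray(closing, dtype=int)
    n = first.shape[0]
    total = 0
    for i in range(n):
        for j in range(n):
            if i == j or not first[i, j]:
                continue
            for k in range(n):
                if k != i and k != j and second[j, k] and closing[i, k]:
                    total += 1
    return total


# Hypothetical toy data on three individuals (not the grade 7 network).
B = np.array([[0, 1, 0],
              [0, 0, 0],
              [0, 0, 0]])   # friend tie 0 -> 1
N = np.array([[0, 0, 1],
              [0, 0, 1],
              [0, 0, 0]])   # non-friend ties 0 -> 2 and 1 -> 2

# Model 5a, second added clique: B from i to j, N from j to k, N from i to k.
agree_on_nonfriend = triad_count(B, N, N)
# Model 5a, first added clique: B from i to j, j to k and i to k.
transitive_friends = triad_count(B, B, B)
```

In the toy data, friends 0 and 1 both name individual 2 as a non-friend, so the first count picks up exactly one configuration while the purely B-based count finds none.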

The MPLEs for the parameters of model 5b are displayed in Table 5. Positive estimates were observed for both reciprocity parameters and for the parameters associated with three of the four additional hypothesized dependencies. Thus the conditional odds of a tie of any type appear to be enhanced if a reciprocal tie of the same type is present, if the tie completes one of the expected triadic structures for agreement between friends, or if the tie completes a triad in which an individual would rather not have as a friend any friend of someone who has been indicated as a non-friend. Negative estimates were obtained for the exchange parameter, for 2-stars comprising two incoming X_B ties and for 3-cycles comprising X_B ties. Thus the conditional odds of a tie of any type appear to be reduced by the presence of a reciprocated tie of the other type; in addition, the odds of an X_B tie being directed to a particular individual are reduced if other X_B ties are also directed to the same individual, or if the tie completes a 3-cycle of X_B ties.
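Because the parameters act on the conditional log-odds of a tie, an estimate can be turned into an odds multiplier by exponentiation. The sketch below is illustrative only: it reads three of the Table 5 entries as 2.53, 0.61 and −0.67 (two decimal places assumed), and it assumes that adding the tie changes the corresponding statistic by exactly one unit while all other statistics stay fixed. The dictionary keys and the function name `odds_multiplier` are hypothetical.

```python
import math

# Model 5b estimates as read from Table 5 (decimal placement is an assumption).
estimates = {
    "B reciprocity": 2.53,    # relation B ∩ B′
    "N reciprocity": 0.61,    # relation N ∩ N′
    "B-N exchange": -0.67,    # relation B ∩ N′
}


def odds_multiplier(estimate, change=1):
    """Factor applied to the conditional odds of a tie when the
    associated statistic changes by `change` units."""
    return math.exp(estimate * change)


multipliers = {name: odds_multiplier(value) for name, value in estimates.items()}
for name, factor in multipliers.items():
    print(f"{name}: conditional odds multiplied by {factor:.2f}")
```

Under these assumptions a reciprocating X_B tie raises the conditional odds of an X_B tie by roughly an order of magnitude, whereas a reciprocating tie of the other type roughly halves them.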

4.2. Padgett & Ansell’s Florentine network

Our second example is an analysis of marriage and business ties among groups of Florentine families (Padgett & Ansell, 1993). In an analysis of the rise to power of the Medici family in Florence in the early fifteenth century, Padgett & Ansell constructed a number of network relations among 33 groups of elite families, including marriage and business or economic ties. The construction was based on a coding of various types of network relations among a 92-family ruling elite from Kent’s (1978) description of the network foundations of the Medici party and their opponents. Padgett & Ansell used marriage and economic networks to derive a clustering of the 92 families into 33 family groups (using the CONCOR algorithm; see Breiger, Boorman & Arabie, 1975); they then coded a relation of a particular type between two family groups if there were at least two pairs of families, with one family from each group, linked by a relation of that type. The analysis presented below is for marriage and economic relations among these 33 family groups, shown in figure 2a of Padgett & Ansell (1993); for the purpose of the analysis reported below, within-group relationships are ignored and the various types of economic ties are aggregated into a single business/


Table 5. Parameter estimates for model 5b fitted to the grade 7 peer network

Model parameter                       Z           hZ      Approximate standard error

1-paths (choice)                      B          −1.81    0.76
                                      N          −2.39    0.65

2-cycles (reciprocity & exchange)     B ∩ B′      2.53    0.37
                                      N ∩ N′      0.61    0.26
                                      B ∩ N′     −0.67    0.28

2-paths                               BB          0.01    0.05
                                      BN         −0.03    0.04
                                      NB         −0.11    0.04
                                      NN          0.02    0.04

3-cycles                              BB ∩ B′    −0.72    0.14
                                      BN ∩ B′     0.05    0.08
                                      BN ∩ N′     0.03    0.07
                                      NN ∩ N′    −0.05    0.09

2-stars                               BB′        −0.36    0.08
                                      BN′        −0.08    0.04
                                      NN′         0.06    0.04
                                      B′B        −0.01    0.04
                                      B′N        −0.04    0.03
                                      N′N         0.07    0.02

Additional hypothesized constraints   BB ∩ B      0.57    0.06
                                      BN ∩ N      0.17    0.05
                                      NB ∩ N      0.33    0.05
                                      NN ∩ B     −0.09    0.06

economic relation. Thus a marriage tie is coded from one group to another if a woman of the first group is married to a man in the second; a business/economic tie signifies the presence of trading or partnership relationships, the sharing or renting of real estate, or a bank employment relation (see Padgett & Ansell, 1993, pp. 1265–1266).
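The group-level coding rule described above (a tie between two family groups when at least two family pairs, one family from each group, are linked) can be sketched as a small aggregation step. This is a hedged illustration rather than Padgett & Ansell's procedure: the function name `aggregate_to_groups`, the argument names and the toy data are hypothetical, and treating family-level ties as directed when counting pairs is an assumption.

```python
import numpy as np


def aggregate_to_groups(family_ties, group_of, n_groups, min_pairs=2):
    """Collapse a family-by-family 0/1 matrix into a group-by-group 0/1 matrix.

    A tie is coded from group g to group h when at least `min_pairs`
    family-level ties run from members of g to members of h.
    Within-group relationships are ignored (diagonal set to zero).
    """
    family_ties = np.asarray(family_ties, dtype=int)
    group_of = np.asarray(group_of, dtype=int)
    membership = np.zeros((len(group_of), n_groups), dtype=int)
    membership[np.arange(len(group_of)), group_of] = 1
    pair_counts = membership.T @ family_ties @ membership
    group_ties = (pair_counts >= min_pairs).astype(int)
    np.fill_diagonal(group_ties, 0)
    return group_ties


# Hypothetical toy data: four families in two groups.
families = np.array([[0, 1, 1, 0],
                     [0, 0, 0, 1],
                     [1, 0, 0, 0],
                     [0, 0, 0, 0]])
groups = [0, 0, 1, 1]
group_level = aggregate_to_groups(families, groups, n_groups=2)
```

Here two family-level ties run from group 0 to group 1, so that tie is coded, whereas the single tie in the opposite direction falls below the threshold.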

Padgett & Ansell used the interconnections among social and demographic factors, these relational ties, and actions on the part of Cosimo de’ Medici to explain the source of the latter’s extraordinary power; here we examine the joint network structure of the marriage and business/economic ties.

We label the relations studied by Padgett & Ansell as X_B (business ties) and X_M (marriage ties). Their associated matrices are B and M, respectively.

In Table 6 we report the fit of six classes of models, similar in construction to those reported for the grade 7 peer network. As for the grade 7 peer network, models 1a and 1b are two- and one-parameter complete independence models, respectively, and model 2 is a multiplexity model. It is clear from Table 6 that there is little improvement in fit of the two-parameter choice complete independence model (model 1a) over the one-parameter choice model (model 1b) (ΔG²_PL = 0.7 with one extra parameter); in addition, permitting dependencies among marriage and business ties for the same individuals does little to improve model fit (ΔG²_PL = 0.4 for model 2 compared to model 1a). Models 3a and 3b are reciprocity and exchange models. Model 3a adds to model 1a the reciprocity effects for X_B and X_M ties; model 3b further adds the exchange effect that allows conditional dependence of a marriage tie from i to j and a business tie from j to i. The reciprocity effects in model 3a lead to a substantial improvement in fit over model 1a (ΔG²_PL = 164.0 with two additional parameters), but no further improvement is achieved by permitting the dyadic exchange of marriage and business ties (ΔG²_PL = 0.2). Model 4 is a path-dependent model and is a marginal improvement in fit over model 3b (ΔG²_PL = 45.1 with six additional parameters). Parameters corresponding to cycles with two or more business ties were excluded from the model because of the infrequency of occurrence of such structures.
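The nested comparisons quoted in this paragraph are simple differences between rows of Table 6. The following sketch reproduces that arithmetic; it assumes the deviance column carries one decimal place (for example 487.2 for model 1a), and the names `TABLE_6` and `compare` are hypothetical. The differences are descriptive only, since pseudo-likelihood deviances do not carry the usual chi-squared reference distribution.

```python
# (number of parameters, G²_PL) for the first six models in Table 6,
# with one decimal place assumed in the deviance column.
TABLE_6 = {
    "1b": (1, 487.9),
    "1a": (2, 487.2),
    "2": (3, 486.8),
    "3a": (4, 323.2),
    "3b": (5, 323.0),
    "4": (11, 277.9),
}


def compare(smaller, larger):
    """Return (drop in G²_PL, number of extra parameters) for two nested models."""
    p_small, g_small = TABLE_6[smaller]
    p_large, g_large = TABLE_6[larger]
    return round(g_small - g_large, 1), p_large - p_small


for smaller, larger in [("1b", "1a"), ("1a", "2"), ("1a", "3a"), ("3a", "3b"), ("3b", "4")]:
    drop, extra = compare(smaller, larger)
    print(f"model {smaller} -> model {larger}: drop of {drop} for {extra} extra parameter(s)")
```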

Since, as Padgett & Ansell (1993) note, the gaining of hierarchical status was the primary consideration in the arrangement of marriage ties between elite families, we might expect marriage ties to exhibit a tendency towards transitivity. Hence model 5a assumes, in addition


Table 6. Summary of fit of models 1a–6d to the Florentine network

Model   No. of parameters   G²_PL   Mean absolute residual

1a       2                  487.2   0.048
1b       1                  487.9   0.048
2        3                  486.8   0.048
3a       4                  323.2   0.032
3b       5                  323.0   0.032
4       11                  277.9   0.029
5a      18                  243.7   0.026
5b      17                  246.3   0.026
6a      21                  227.9   0.026
6b      23                  226.7   0.026
6c      23                  225.2   0.026
6d      23                  217.0   0.025

to conditional dependencies for paths of length 2, pairwise conditional dependencies among marriage ties from i to j, j to k and i to k (and hence adds a parameter corresponding to the relation X = MM ∩ M). Further, all possible stars comprising two relations are added as well, in order to investigate possible interdependencies between marriage and business ties that are not evident at the level of ties from an actor i to an actor j (see the comparison between the complete independence model 1a and the multiplex model 2). These dependencies also require various star parameters hZ, for Z equal to MM′, M′M, M′B and BB′.

The fit of model 5a was a modest improvement over that of model 4 (ΔG²_PL = 34.2 with six additional parameters). The estimated parameter corresponding to the relation MM ∩ M is not large, so in model 5b the parameter is removed, with little effect on the fit of the model (ΔG²_PL = 2.6).

A final set of models fitted to the data investigated the possibility of structural differences in ties according to party affiliation. As Padgett & Ansell (1993) observed, the first 10 family groups are substantially identified with the Medici party (the Medici family themselves comprising group 1), whereas the remaining groups of families are not. Padgett & Ansell described the remarkable structural differences between the network of relations within the Medici party and within the remaining (largely oligarchic) set. Models 6a–6d therefore allow various model 5b parameters to differ according to whether they refer to ties lying within the collection of Medici blocks, to ties connecting non-Medici blocks, or to ties crossing the boundary between the two collections of blocks. Model 6a allows such variation for the density parameter and is a substantial improvement over model 5b (ΔG²_PL = 18.4 with four additional parameters). Model 6b permits the parameters for ‘mixed’ out-stars comprising marriage and business ties to differ for the three types of blocks and is not a substantial improvement over model 6a (ΔG²_PL = 1.4). Model 6c allows heterogeneity across blocks in the parameters for 2-paths comprising marriage and business ties; it also fails to improve fit compared to model 6a (ΔG²_PL = 2.5). The final model, 6d, permits heterogeneity across blocks in the parameters for paths comprising two marriage ties; in this case there is a modest improvement in fit compared to model 6a (ΔG²_PL = 10.8 with two additional parameters).

The estimated parameters for model 6d are shown in Table 7. The estimates suggest a strong tendency for reciprocated business ties, a tendency that is unsurprising given the form of business or economic ties such as partnerships. There are weaker tendencies for the existence of 2-paths comprising either marriage or business ties; marriage ties also appear to be more likely if they complete a cycle of three marriage ties. Padgett & Ansell (1993) noted the presence of these cycles and analysed both their development and their consequences; they make a compelling argument for their importance to the evolving structure of the oligarchy. It can also be seen from Table 7 that path structures in which an outgoing marriage tie is accompanied by an incoming business tie reduce the likelihood of the overall structure. Estimates of star parameters suggest the prevalence of heterogeneous stars in which a group of families have marriage ties with one group and business ties with another. The parameter estimates for homogeneous marriage in-stars and out-stars are both negative: there appears to have been a reduced conditional probability of a marriage tie to a family group if some other group also had such a tie and, to a lesser extent, if the first family group had another outgoing marriage tie.
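The block-dependent effects in models 6a–6d rest on sorting every ordered pair of family groups into one of three categories. A minimal sketch of that classification follows, assuming (as stated above) that the first 10 of the 33 family groups form the Medici collection; the function name `block_masks` and the dictionary keys are hypothetical, and the sketch says nothing about how the authors parameterized the resulting effects.

```python
import numpy as np


def block_masks(n_groups=33, n_medici=10):
    """Boolean masks over ordered pairs (i, j), i != j, by block type."""
    is_medici = np.arange(n_groups) < n_medici
    off_diagonal = ~np.eye(n_groups, dtype=bool)
    within_medici = np.outer(is_medici, is_medici) & off_diagonal
    within_other = np.outer(~is_medici, ~is_medici) & off_diagonal
    between = off_diagonal & ~within_medici & ~within_other
    return {
        "within Medici": within_medici,
        "between subgroups": between,
        "within other subgroups": within_other,
    }


masks = block_masks()
pair_counts = {name: int(mask.sum()) for name, mask in masks.items()}
```

With 10 Medici and 23 other groups, the 1056 ordered pairs split into 90 within the Medici collection, 506 within the remaining groups and 460 crossing the boundary, which gives a sense of how much data informs each block-specific parameter.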

The parameters for block-dependent densities suggest an enhanced likelihood of marriage ties within the Medici collection of family groups and, to a lesser extent, within the non-Medici collection; marriage ties between the two types of family groups were less likely. Business ties exhibit a substantially weaker pattern of the same form. Together, these characteristics of the network reflect what Padgett & Ansell noted was a remarkable interdependence of marriage and economic ties, on the one hand, and political partisanship, on the other, and they support their conclusion that the microstructure of marriage and economics was central to the formation of parties in Florence (1993, p. 1277). The block-dependence of marriage 2-paths takes a different and interesting form: such paths are less likely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection, and they are even more likely to link family groups of different types. The group containing members of the Medici family is the major contributor to this pattern, as they are the only Medici group with marriage connections outside the collection mobilized into the Medici party. Note that this structural effect is fitted at the same time as the cyclic pattern for marriage ties, so that although, as Padgett & Ansell noted, there are many more two-step marriage connections for non-Medici than for Medici partisans, many of the former


Table 7. Parameter estimates for model 6d fitted to the Florentine network

Model parameter                        Z           hZ      Approximate standard error

1-paths (choice)                       M          −5.17    1.02
                                       B          −7.37    1.25

2-cycles (reciprocity and exchange)    M ∩ M′      0.95    0.94
                                       B ∩ B′     10.33    1.72
                                       M ∩ B′      0.65    1.08

2-paths                                MM          0.66    0.32
                                       MB          0.16    0.38
                                       BM         −0.84    0.37
                                       BB          1.26    0.95

3-cycles                               MM ∩ M′     2.12    0.61
                                       MB ∩ M′    −0.35    0.85

2-stars                                MM′        −1.55    0.37
                                       M′M        −0.43    0.20
                                       BB′        −1.53    1.08
                                       B′B        −0.85    0.99
                                       MB′        −0.14    0.36
                                       M′B         0.92    0.35

Subgroup-dependent M effects
1-paths within Medici                              3.71    1.12
1-paths between subgroups                         −4.67    1.92
1-paths within other subgroups                     0.96

Subgroup-dependent B effects
1-paths within Medici                              0.70    1.06
1-paths between subgroups                         −0.80    0.87
1-paths within other subgroups                     0.10

Subgroup-dependent MM effects
2-paths within Medici                             −1.33    0.46
2-paths between subgroups                          1.08    0.44
2-paths within other subgroups                     0.25

connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).

Thus model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell’s detailed account.

5. Conclusion

The multivariate p* model is very general in form and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996), and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number r of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research – much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form z(x) of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different ‘parts’ of the network (and in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that in the literature on algebraic models for multivariate networks there is a second tradition that can usefully guide such statistical developments. Local structural descriptions based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995) describe heterogeneity across individuals. Thus a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown in general how Markov Chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).
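To indicate what the simulation step of such a parametric bootstrap might involve, the sketch below runs a single-tie-toggle Metropolis sampler for a deliberately simplified model: one directed relation with only a choice (density) parameter and a reciprocity parameter. It is an illustration under these stated assumptions, not the procedure of Preisler (1993) nor the multivariate models fitted in this paper; the function name `simulate_network` and its arguments are hypothetical.

```python
import math
import random


def simulate_network(n, theta_choice, theta_recip, n_steps, seed=0):
    """Metropolis sampler for P(x) proportional to
    exp(theta_choice * ties(x) + theta_recip * mutual_dyads(x))."""
    rng = random.Random(seed)
    x = [[0] * n for _ in range(n)]          # start from the empty digraph
    for _ in range(n_steps):
        i, j = rng.sample(range(n), 2)       # random ordered pair, i != j
        sign = -1 if x[i][j] else 1          # removing or adding the tie
        # Change in log-probability: one tie, plus one mutual dyad
        # if the reverse tie j -> i is present.
        log_ratio = sign * (theta_choice + theta_recip * x[j][i])
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x[i][j] = 1 - x[i][j]            # accept the toggle
    return x


# Hypothetical parameter values, chosen only for illustration.
sample = simulate_network(n=10, theta_choice=-1.5, theta_recip=2.0, n_steps=20000)
n_ties = sum(map(sum, sample))
```

In a bootstrap, many such simulated networks would each be refitted by maximum pseudo-likelihood, and the spread of the refitted estimates would serve as an approximate standard error; only the simulation step is shown here.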

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754) and the National Institute of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance, and Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.

Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.

Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.

Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Association, 47, 77–92.

Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.

Boorman, S. A. & White, H. C. (1976). Social structure from multiple networks: II. Role structures. American Journal of Sociology, 81, 1384–1446.

Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.

Breiger, R. L., Boorman, S. A. & Arabie, P. (1975). An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.

Coleman, J. S., Katz, E. & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N. & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R. & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B. & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure: II. Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E. & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds), Operations research proceedings 1986, pp. 455–460. Berlin/Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathematiques, Informatique et Science Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B. & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O. & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds), Quo Vadis, Graph Theory. Annals of Discrete Mathematics, 55, 349–366.

Frank, O. & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J. & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J. & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W. & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W. & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J. & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D. & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.


Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L. & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E. & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A. & Pattison, P. (1998). Multivariate p* models of producers’ market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P. & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F. & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P. & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G. & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). p12. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D. & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M. & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.


Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S. & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S. & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S. & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 60, 401–425.

Wasserman, S. & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S. & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A. & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C. & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999


Page 6: Logit models and logistic regressions for social …...Fienberg, Meyer & Wasserman (1985), Wasserman (1987), Iacobucci & Wasserman (1987) and Iacobucci (1989) extended thep1family

individual $i$ to another individual $j$ may be conditionally dependent on ties from $j$ to $i$ of other types (see, for example, Parsons, 1966, for a traditional appeal to mutually consistent expectations through role norms, and Leifer, 1988, for an alternative and interesting dynamic account). If such conditional dependencies alone are assumed, then maximal cliques of $D$ have the form $\{(i,j,m), (j,i,l)\}$. If these conditional dependencies are assumed as well as the multiplex conditional dependencies, maximal cliques have the form

$$\{(i,j,1), (i,j,2), \ldots, (i,j,r), (j,i,1), (j,i,2), \ldots, (j,i,r)\}.$$

In the latter case, model (1) describes the multivariate dyad independence model, termed the multivariate $p_1$ model (Wasserman, 1987).

Role interlocking: path dependence. A third type of argument has pointed to the potential importance of role interlocking in social networks (for example, Boorman & White, 1976; Boyd, 1991; Lorrain & White, 1971; Pattison, 1993; White, 1977). It has been argued that the interrelationships among distinct types of ties can be represented by a partial ordering among labelled paths in a social network, where labelled paths systematically trace connections among sequences of individuals (see Pattison, 1993). More specifically, a path with the label $mh$ links individual $i$ to individual $j$ if there is a tie of type $m$ from $i$ to some intermediate individual $l$ and a tie of type $h$ from $l$ to $j$ (that is, if $X_{ilm} = 1$ and $X_{ljh} = 1$). Longer paths are defined recursively: a path with the label $mhn$ links individual $i$ to individual $j$ if there is some individual $l$ such that $i$ is linked to $l$ by a path with the label $mh$ and $l$ is linked to $j$ by a tie with the label $n$. We refer to the path $mhn$ as the concatenation of paths $mh$ and $n$, and note that concatenation is associative; that is, paths constructed as the concatenation of $mh$ and $n$ link precisely the same pairs of individuals as paths constructed by the concatenation of $m$ and $hn$. Paths in networks have been claimed both to provide the essential framework for the flow of social processes, as, for example, in the research on the diffusion of innovations (see, for example, Coleman, Katz & Menzel, 1966; Michaelson, 1990), and to give rise to some powerful anticipatory effects (see Lee, 1969; Mayer, 1977). The most rudimentary form of dependence associated with social structures conceived in this form involves conditional dependence between the variables $X_{ilm}$ and $X_{ljh}$. The maximal cliques induced by such an assumption are cycles of length 2, $\{(i,j,m), (j,i,h)\}$, and cycles of length 3, $\{(i,j,m), (j,l,h), (l,i,n)\}$. The resulting random graph model is new, and is a special case of the multivariate Markov random graphs mentioned earlier. Here, we term the corresponding version of model (1) the path-dependent random multigraph model.
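The construction of labelled paths can be illustrated numerically. The following is a minimal sketch, assuming each relation is stored as a 0/1 adjacency matrix; the helper `compose` and the three small matrices are hypothetical and serve only to show that concatenating $mh$ with $n$ links the same pairs as concatenating $m$ with $hn$.

```python
def compose(a, b):
    """Relational composition: entry (i, j) is 1 if, for some intermediate
    individual l, a[i][l] = 1 and b[l][j] = 1."""
    g = len(a)
    return [[int(any(a[i][l] and b[l][j] for l in range(g)))
             for j in range(g)] for i in range(g)]

# Hypothetical relations of types m, h and n on four individuals (0..3).
X_m = [[0, 1, 0, 0], [0, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]]
X_h = [[0, 0, 0, 0], [0, 0, 1, 0], [0, 0, 0, 0], [1, 0, 0, 0]]
X_n = [[0, 0, 0, 1], [0, 0, 0, 0], [0, 0, 0, 1], [0, 1, 0, 0]]

path_mh = compose(X_m, X_h)                        # paths with label mh
path_mhn_left = compose(path_mh, X_n)              # (mh) followed by n
path_mhn_right = compose(X_m, compose(X_h, X_n))   # m followed by (hn)

print(path_mh[0][2])                     # 1, via the route 0 -m-> 1 -h-> 2
print(path_mhn_left == path_mhn_right)   # True
```

The equality holds for any choice of 0/1 matrices, since relational composition is associative; the example only makes one instance concrete.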

Actor effects. The fourth argument has arisen in the social cognition literature and posits actor attributes or biases associated with either the actor from whom the tie is directed (leading to a so-called row effect) or the actor to whom the tie is directed (a so-called column effect). Row effects are associated with conditional dependencies of the form $\{(i,j,m), (i,k,h)\}$ and give rise to maximal cliques in $D$ of the form

$$\{(i,1,1), (i,1,2), \ldots, (i,1,r), (i,2,1), (i,2,2), \ldots, (i,2,r), \ldots, (i,g,1), (i,g,2), \ldots, (i,g,r)\}.$$

Such dependencies are likely to be assumed when actor $i$ is the source of information about all relational ties emanating from actor $i$; they are also necessarily imposed if constraints are placed on the total number of ties directed from actor $i$ (see Holland & Leinhardt, 1973).

Column effects are associated with conditional dependencies of the form $\{(i,j,m), (k,j,h)\}$, and so with maximal cliques

$$\{(1,j,1), (1,j,2), \ldots, (1,j,r), (2,j,1), (2,j,2), \ldots, (2,j,r), \ldots, (g,j,1), (g,j,2), \ldots, (g,j,r)\}.$$

Position effects and blockmodels. A fifth theme in the structural analysis of multiple networks is that distinctive patterns of inter-individual ties are associated with particular social positions. Thus, individuals occupying similar social positions may exhibit similar conditional dependencies among ties, whereas those occupying distinct positions may possess quite distinct inter-tie dependencies. Knowledge of social position may therefore be used as a basis for some hypothesized equations among the parameters referring to particular patterns of conditional dependencies (determined by cliques in the dependence graph); these issues are further discussed below.

Interdependence of interlocking roles. In addition, several of these arguments may be combined. For instance, if we assume conditional dependencies associated with arguments for multiplexity, reciprocity and exchange, as well as role-interlocking effects, then the class of Markov random multigraphs results; its maximal cliques have the form of either a multivariate triad,

$$\begin{aligned}
\{&(i,j,1), (i,j,2), \ldots, (i,j,r), (j,k,1), (j,k,2), \ldots, (j,k,r), \\
&(k,i,1), (k,i,2), \ldots, (k,i,r), (j,i,1), (j,i,2), \ldots, (j,i,r), \\
&(k,j,1), (k,j,2), \ldots, (k,j,r), (i,k,1), (i,k,2), \ldots, (i,k,r)\},
\end{aligned}$$

or a multivariate star,

$$\begin{aligned}
\{&(1,i,1), (1,i,2), \ldots, (1,i,r), (2,i,1), (2,i,2), \ldots, (2,i,r), \ldots, \\
&(g,i,1), (g,i,2), \ldots, (g,i,r), (i,1,1), (i,1,2), \ldots, (i,1,r), \\
&(i,2,1), (i,2,2), \ldots, (i,2,r), \ldots, (i,g,1), (i,g,2), \ldots, (i,g,r)\}.
\end{aligned}$$

Note that these three assumptions also entail actor effects; hence we claim that the class of Markov random multigraphs is a quite plausible framework for the modelling of structure in multiple social networks.

3.1.3. Homogeneity constraints

For many of the specific dependence graphs that we have discussed, particularly for Markov random multigraphs, model (1) may require the estimation of a large number of parameters. It is often useful, therefore, to introduce certain equality constraints among the parameters, or to set certain parameters to zero. One can define a class of homogeneous models for multivariate networks in which networks that are isomorphic under relabellings of the nodes are equiprobable.

More generally, we introduce an assumption that parameters corresponding to certain isomorphic configurations of nodes are equal. We identify a random graph configuration with a subset $A$ of $N_D$, and we call configurations corresponding to $A$ and $B$ isomorphic if there is a one-to-one mapping $w$ on the nodes in $N$ such that $(i,j,m) \in A$ if and only if $(w(i), w(j), m) \in B$, for $i, j \in N$, $m \in \mathcal{R}$. If two configurations $A$ and $B$ are isomorphic, we set $\lambda_A = \lambda_B$ and note that the sufficient statistic corresponding to $\lambda_A$ becomes

$$\sum_{B} \prod_{(i,j,m) \in B} x_{ijm},$$

where the summation is over all configurations $B$ isomorphic to $A$.

A more restricted form of parameter equating may also be useful when the nodes of the random graph are hypothesized to fall into distinct classes or positions (as in an a priori blockmodel; see, for example, White et al., 1976; Wasserman & Anderson, 1987; Wasserman & Faust, 1994, Chapter 10). In this case, the random graph nodes of the configuration identified with the subset $A$ may be regarded as coloured, and two configurations $A$ and $B$ are defined as isomorphic if there is a one-to-one mapping $w$ on the nodes of $N$ such that

1. $(i,j,m) \in A$ if and only if $(w(i), w(j), m) \in B$;
2. $i$ and $w(i)$ have the same colour;
3. $j$ and $w(j)$ have the same colour.

We then set $\lambda_A = \lambda_B$ only if $A$ and $B$ are isomorphic (using this more restrictive definition).

3.2. The multivariate p* model

As mentioned, we refer to equation (1) as the multivariate p* model. The parameters of the model correspond to the cliques of the dependence graph $D$. The sufficient statistic corresponding to the parameter $\lambda_A$ for clique $A$ of $D$ has the form $\prod_{(i,j,m) \in A} x_{ijm}$; in the case where homogeneity effects have been imposed, the sufficient statistics are counts of such values over cliques whose parameters are set to be equal.

3.2.1. Introduction

The dependence structures for social networks described in the preceding section give rise to network statistics identified in Table 1. In order more easily to define the statistics, we introduce a counting function $f$ for an array $Z$ as the sum of entries in the array, $f_Z = \sum_{i,j} Z_{ij}$. The function $f$ is a count of the number of distinct ordered pairs of nodes $i$ and $j$ for which there is a relational tie of type $Z$. For convenience, we refer to the parameter corresponding to $f_Z$ as $\theta_Z$.

Table 1. Some statistics and parameters for univariate and multivariate relations

Effect                      Parameter         Graph statistic in ‘count form’

Single dichotomous relations
Choice                      $\theta$          $f_X$
Mutuality                   $\rho$            $f_{X \cap X'}$
Transitivity                $\tau$            $f_{XX \cap X}$
Expansiveness               $\alpha_i$        $f_{X \cap R_i}$
Attractiveness              $\beta_j$         $f_{X \cap C_j}$
m-paths                     $\pi_m$           $f_{X^m}$
Subgroup density            $\phi_{[st]}$     $f_{X \cap \delta_{st}}$
Subgroup mutuality          $\rho_{[st]}$     $f_{X \cap X' \cap \delta_{st}}$
Subgroup transitivity       $\tau_{[sut]}$    $f_{(X \cap \delta_{su})(X \cap \delta_{ut}) \cap (X \cap \delta_{st})}$

Multivariate relations
Association                 $C$               $f_{X \cap Y}$
Multiplexity                $\theta_{kl}$     $f_{X_k \cap X_l}$
Exchange                    $\rho_{kl}$       $f_{X_k \cap X_l'}$
Generalized transitivity    $\tau_{klm}$      $f_{(X_k X_l) \cap X_m}$

When homogeneity constraints are imposed, we can represent the sufficient statistics in a compact form. For the assumption of multiplex conditional dependencies, any clique in $N_D$ has the form

$$A = \{(i,j,m_1), (i,j,m_2), \ldots, (i,j,m_q)\};$$

thus, in the homogeneous case, the sufficient statistic for the multiplex parameter associated with the clique $A$ is $f_Z$, where $Z = X_{m_1} \cap X_{m_2} \cap \cdots \cap X_{m_q}$. (Note that any non-empty subset of relations gives rise to a clique of this form, so that we also have statistics of the form $f_{X_m}$, $f_{X_k \cap X_l}$, and so forth.)

Reciprocity cliques of the form $\{(i,j,m), (j,i,l)\}$ give rise to the exchange statistics $f_{X_k \cap X_l'}$. Cliques in role-interlocking dependence structures lead to additional 2-path and 3-cycle statistics of the form $f_{X_m X_h}$ and $f_{(X_m X_h) \cap X_n'}$, respectively.

Some of the statistics for parameters reflecting row and column effects can be defined using the indicator matrices $R_i$ and $C_j$, whose elements are given by

$$(R_i)_{kl} = \begin{cases} 1 & \text{if } k = i, \\ 0 & \text{otherwise,} \end{cases}
\qquad
(C_j)_{kl} = \begin{cases} 1 & \text{if } l = j, \\ 0 & \text{otherwise.} \end{cases}$$

In order to define statistics for the Markov random multigraph model, let $\mathcal{R}_k$ be any subset of relations and define $Y_k$ as the intersection of the relations in $\mathcal{R}_k$. The triad statistic corresponding to a general multivariate triad has the general form $f_Z$, with $Z = (Y_1 \cap Y_4')(Y_2 \cap Y_5') \cap (Y_3 \cap Y_6')$ for some $Y_1, Y_2, \ldots, Y_6$.

When homogeneity is imposed only within $S$ possible blocks or positions, the network statistics that arise correspond to within-block sums and can be represented by using the matrix $\delta_{st}$ with entries

$$(\delta_{st})_{ij} = \begin{cases} 1 & \text{if } i \in \text{block } s \text{ and } j \in \text{block } t, \\ 0 & \text{otherwise.} \end{cases}$$

For example, in the case of any homogeneous statistic $f_Z$, the block-homogeneous set of statistics is $\{f_{Z \cap \delta_{st}} : s = 1, 2, \ldots, S;\ t = 1, 2, \ldots, S\}$.

Some other network statistics and associated parameters are also presented in Table 1. This table also identifies the parameter labels used in Wasserman & Pattison (1996) and their generalizations to multivariate networks.

Note that each of the statistics described above may be assumed to be homogeneous, or may be allowed to depend on some mutually exclusive and exhaustive partition of actors or pairs of actors. For example, generalized transitivity statistics may be calculated for every triple of subgroups arising from a partition (for example, $f_{(X_k \cap \delta_{su})(X_l \cap \delta_{ut}) \cap (X_m \cap \delta_{st})}$) and may be used to assess the homogeneity of generalized transitivity across subgroups.
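The count-form statistics can be computed directly from 0/1 adjacency matrices. The following is a minimal sketch under that assumption; the helper names (`transpose`, `intersect`, `matmul`, `f`, `delta`), the two three-actor relations and the block assignment are all hypothetical and are introduced only for illustration.

```python
def transpose(a):
    return [list(col) for col in zip(*a)]

def intersect(a, b):
    """Entrywise product; for 0/1 matrices this is the intersection."""
    return [[p * q for p, q in zip(ra, rb)] for ra, rb in zip(a, b)]

def matmul(a, b):
    """Matrix product; entry (i, j) counts two-step paths from i to j."""
    g = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(g)) for j in range(g)]
            for i in range(g)]

def f(z):
    """Counting function: the sum of the entries of the array z."""
    return sum(sum(row) for row in z)

def delta(blocks, s, t):
    """Block indicator: 1 where the row actor is in block s and the
    column actor is in block t."""
    return [[int(bi == s and bj == t) for bj in blocks] for bi in blocks]

# Hypothetical relations on three actors; actors 0 and 1 form block 0.
X1 = [[0, 1, 1], [1, 0, 0], [0, 1, 0]]
X2 = [[0, 1, 0], [0, 0, 1], [1, 0, 0]]
blocks = [0, 0, 1]

choice = f(X1)                                            # 4
mutuality = f(intersect(X1, transpose(X1)))               # 2
multiplexity = f(intersect(X1, X2))                       # 1
exchange = f(intersect(X1, transpose(X2)))                # 3
gen_transitivity = f(intersect(matmul(X1, X2), X1))       # 1
block_density = f(intersect(X1, delta(blocks, 0, 1)))     # 1
```

Note that, under this counting convention, each mutual dyad contributes 2 to the mutuality statistic, because both ordered pairs are counted.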


3.2.2. The model

In combination with various homogeneity constraints, model (1) can be written in the general form

$$P(X = x) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)}, \tag{2}$$

where $\theta$ is a vector of model parameters and $z(x)$ is a vector of network statistics. As we have described, these vectors depend on the structure of the hypothesized dependence graph and on whether any homogeneity constraints have been proposed.

The model is of exponential family form; that is, the probability function depends on an exponential function of a linear combination of network statistics. In some cases, constraints on the elements of $\theta$ are required in order to ensure a set of uniquely determined parameters (as we illustrate later with our examples). Usually, the elements of $\theta$ are unknown and must be estimated.

The function $\kappa(\theta)$ in the denominator of model (2) is a normalizing quantity, whose value guarantees that the probability distribution is indeed proper, summing to unity over the sample space of the random variable $X$ (the set of all possible multivariate networks with $r$ relations and $g$ actors).

Estimation of the parameters of models that assume only multiplexity and/or generalized reciprocity and exchange effects (as in the multivariate $p_1$ model) is not particularly difficult. In these cases, the likelihood function is simply the product of the probabilities for each multivariate tie or dyad (for example, see Wasserman, 1987). Estimation of parameters of the general multivariate p* model is not straightforward, however. The likelihood function for the parameters $\theta$ of p* depends on the complicated normalizing quantity $\kappa(\theta)$, which makes maximum likelihood estimation difficult except in special circumstances (such as dyadic independence) and when the multigraphs are quite small (Walker, 1995). In order for probabilities to be computed, one must be able to calculate $\kappa$, which is just too difficult for most networks. Hence, alternative model formulations and approximate estimation techniques are important. One such alternative, which we now describe, utilizes log-odds ratios of the conditional probabilities of each element of $X$.
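The difficulty with the normalizing quantity can be made concrete by brute-force enumeration on a very small multigraph. The sketch below is illustrative only: the statistic vector (two choice counts and one multiplexity count) and the function names are assumptions chosen for the example, not the specification of any model fitted in this paper.

```python
from itertools import product
from math import exp

def tie_variables(g, r):
    """All tie variables (i, j, m) with i != j."""
    return [(i, j, m) for m in range(r) for i in range(g)
            for j in range(g) if i != j]

def z(x, g):
    """Hypothetical statistics for r = 2: the two choice counts and the
    multiplexity count (ties present on both relations)."""
    pairs = [(i, j) for i in range(g) for j in range(g) if i != j]
    return [sum(x[(i, j, 0)] for i, j in pairs),
            sum(x[(i, j, 1)] for i, j in pairs),
            sum(x[(i, j, 0)] * x[(i, j, 1)] for i, j in pairs)]

def kappa(theta, g, r=2):
    """Normalizing quantity: sum of exp(theta' z(x)) over every
    multigraph in the sample space."""
    ties = tie_variables(g, r)
    total = 0.0
    for values in product((0, 1), repeat=len(ties)):
        x = dict(zip(ties, values))
        total += exp(sum(t * s for t, s in zip(theta, z(x, g))))
    return total

print(len(tie_variables(3, 2)))       # 12 tie variables, so 2**12 = 4096 multigraphs
print(kappa([0.0, 0.0, 0.0], 3))      # 4096.0
print(len(tie_variables(29, 2)))      # 1624 tie variables for g = 29, r = 2
```

With $g = 29$ actors and $r = 2$ relations the sum would run over $2^{1624}$ multigraphs, which is why direct evaluation of $\kappa(\theta)$ is not feasible for networks of realistic size.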

3.2.3. The logit model

We can turn model (2) into a generalized autologistic model for conditional probabilities, giving us an equivalence between model (2) and spatial models (Besag, 1972, 1974; Strauss, 1992). The step utilizes the dichotomous nature of the random variable $X_{ijm}$ and produces an approximate likelihood function that is much easier to deal with.

We first condition on the complement of $X_{ijm}$, and consider just the probability that the dichotomous random variable $X_{ijm}$ is unity. Recall that this variable records whether the tie from $i$ to $j$ of type $m$ is present. Specifically, writing $x^+_{ijm}$ and $x^-_{ijm}$ for the realization $x$ with $x_{ijm}$ set to 1 and to 0, respectively, consider

$$P(X_{ijm} = 1 \mid X^c_{ijm}) = \frac{P(X = x^+_{ijm})}{P(X = x^+_{ijm}) + P(X = x^-_{ijm})} = \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^+_{ijm})\} + \exp\{\theta' z(x^-_{ijm})\}}, \tag{3}$$

which has the advantage of not depending on the normalizing quantity. We next consider the odds ratio, which simplifies model (3):

$$\frac{P(X_{ijm} = 1 \mid X^c_{ijm})}{P(X_{ijm} = 0 \mid X^c_{ijm})} = \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^-_{ijm})\}} = \exp\{\theta' [z(x^+_{ijm}) - z(x^-_{ijm})]\}. \tag{4}$$

From this, the log-odds ratio, or logit model, has the rather simple expression

$$\omega_{ijm} = \log\left\{\frac{P(X_{ijm} = 1 \mid X^c_{ijm})}{P(X_{ijm} = 0 \mid X^c_{ijm})}\right\} = \theta' [z(x^+_{ijm}) - z(x^-_{ijm})]. \tag{5}$$

If we define $\delta(x_{ijm}) = [z(x^+_{ijm}) - z(x^-_{ijm})]$, then the logit model (5) simplifies succinctly to $\omega_{ijm} = \theta' \delta(x_{ijm})$. The expression $\delta(x_{ijm})$ is the vector of changes in the network statistics that arises when the quantity $x_{ijm}$ changes from 1 to 0. This version of the model is a logit p* model for a multivariate network, and is a generalized autologistic model (see Strauss, 1992) applied to social network data.

3.3. Estimation

The likelihood function for the general form of multivariate p* model (2) is

$$L(\theta) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)},$$

where the dependence on the normalizing quantity can easily be seen. As mentioned, maximum likelihood estimation of $\theta$ is difficult because of the size of the sample space.

An approximate estimation approach, proposed by Besag (1975, 1977b) and adopted by Strauss (1986), Strauss & Ikeda (1990) and Wasserman & Pattison (1996), utilizes tools made popular in models for rectangular lattices and spatial data; specifically, we use the logit formulation and define the pseudo-likelihood function as

$$PL(\theta) = \prod_{i \neq j} \prod_{m=1}^{r} P(X_{ijm} = 1 \mid X^c_{ijm})^{x_{ijm}} \, P(X_{ijm} = 0 \mid X^c_{ijm})^{1 - x_{ijm}}, \tag{6}$$

and a maximum pseudo-likelihood estimator (MPLE) to be the value of $\theta$ that maximizes (6). MPLEs are much easier to calculate than maximum likelihood estimators (MLEs). MPLEs differ from MLEs for all but the simplest models (those for which the conditional probabilities are indeed independent of the complement relation). Basically, the approach assumes conditional independence of the random variables representing the multivariate relational ties (for discussion of the issues in using maximum pseudo-likelihood rather than maximum likelihood estimation, see Wasserman & Pattison, 1996, and Preisler, 1993).
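The pseudo-likelihood in (6) can be evaluated on the log scale with a few lines of code. The sketch below assumes a hypothetical two-statistic model (a pooled tie count and a pooled count of reciprocated ties); the function names and the small network are illustrative assumptions only.

```python
from math import exp, log

def z(x):
    """Hypothetical statistics pooled over relations: number of ties and
    number of reciprocated ties (each mutual dyad is counted twice)."""
    g = len(x[0])
    pairs = [(i, j) for i in range(g) for j in range(g) if i != j]
    return [sum(rel[i][j] for rel in x for i, j in pairs),
            sum(rel[i][j] * rel[j][i] for rel in x for i, j in pairs)]

def delta(x, i, j, m):
    """Change statistics z(x+) - z(x-) for the tie variable (i, j, m)."""
    plus = [[row[:] for row in rel] for rel in x]
    minus = [[row[:] for row in rel] for rel in x]
    plus[m][i][j], minus[m][i][j] = 1, 0
    return [a - b for a, b in zip(z(plus), z(minus))]

def log_pseudo_likelihood(theta, x):
    """Logarithm of (6): sum over tie variables of the log conditional
    probability of the observed value."""
    g = len(x[0])
    total = 0.0
    for m in range(len(x)):
        for i in range(g):
            for j in range(g):
                if i == j:
                    continue
                eta = sum(t * d for t, d in zip(theta, delta(x, i, j, m)))
                p1 = 1.0 / (1.0 + exp(-eta))
                total += log(p1) if x[m][i][j] == 1 else log(1.0 - p1)
    return total

x = [[[0, 1, 1], [1, 0, 0], [0, 1, 0]],
     [[0, 1, 0], [0, 0, 1], [1, 0, 0]]]
print(log_pseudo_likelihood([0.0, 0.0], x))   # equals 12 * log(0.5)
```

At $\theta = 0$ every conditional probability is 0.5, so the log pseudo-likelihood is simply the number of tie variables times $\log 0.5$, which gives a quick check on the implementation.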

There is a large literature on the use of approximate likelihoods in spatial modelling. Diggle (1996) reviews models for discrete spatial variation and notes that there are several possible estimation techniques. He notes in his detailed discussion that MPLEs are more efficient than other possibilities (which include the coding method of Besag, 1974). Further, for moderately large samples, the differences between MPLEs and MLEs are often negligible. Small sample sizes, and hence small networks ($g < 10$), unfortunately, are particularly problematic.

In social network modelling, Strauss & Ikeda (1990) established that estimation of $\theta$ for single dichotomous relations can be accomplished via logistic regression, using any standard logistic regression model-fitting routine. In particular, they showed that maximizing the pseudo-likelihood given in equation (6) is equivalent to maximizing the likelihood function for the fit of logistic regression to model (5) (for independent observations $x_{ijm}$). Further, they observed that such logistic regressions can be fitted using iteratively reweighted Gauss–Newton computational techniques, as implemented by any logistic regression model package.

The proof of this result uses the fact that the derivatives of the pseudo-likelihood, set equal to zero, are identical to those obtained from a logistic regression with the relational variables as data values. Thus, fitting p* can be done by using the logit p* form and assuming that the relational variables are actually statistically independent. The idea for this theorem was first suggested by Frank & Strauss (1986) for estimation of the parameters in their triad model. The generalization of this result to the three-way binary array $X$ is straightforward.
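The logistic-regression route to the MPLE can be sketched without any statistical package. The code below is a minimal illustration, assuming a hypothetical two-parameter model (pooled choice and pooled mutuality) and using a plain Newton–Raphson iteration, which for logistic regression coincides with iteratively reweighted least squares; the two four-actor relations are invented for the example.

```python
from math import exp, log

def z(x):
    """Hypothetical statistics pooled over relations: ties, reciprocated ties."""
    g = len(x[0])
    pairs = [(i, j) for i in range(g) for j in range(g) if i != j]
    return [sum(rel[i][j] for rel in x for i, j in pairs),
            sum(rel[i][j] * rel[j][i] for rel in x for i, j in pairs)]

def design(x):
    """Explanatory variables z(x+) - z(x-) and responses x_ijm, one row per tie."""
    rows, y, g = [], [], len(x[0])
    for m in range(len(x)):
        for i in range(g):
            for j in range(g):
                if i == j:
                    continue
                plus = [[r[:] for r in rel] for rel in x]
                minus = [[r[:] for r in rel] for rel in x]
                plus[m][i][j], minus[m][i][j] = 1, 0
                rows.append([a - b for a, b in zip(z(plus), z(minus))])
                y.append(x[m][i][j])
    return rows, y

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(c + 1, n):
            factor = m[r][c] / m[c][c]
            for k in range(c, n + 1):
                m[r][k] -= factor * m[c][k]
    sol = [0.0] * n
    for r in reversed(range(n)):
        sol[r] = (m[r][n] - sum(m[r][k] * sol[k] for k in range(r + 1, n))) / m[r][r]
    return sol

def fit_logistic(rows, y, iterations=25):
    """Newton-Raphson for a logistic regression without a separate intercept."""
    p = len(rows[0])
    theta = [0.0] * p
    for _ in range(iterations):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for d, yi in zip(rows, y):
            mu = 1.0 / (1.0 + exp(-sum(t * dk for t, dk in zip(theta, d))))
            for a in range(p):
                grad[a] += (yi - mu) * d[a]
                for b in range(p):
                    hess[a][b] += mu * (1.0 - mu) * d[a] * d[b]
        theta = [t + s for t, s in zip(theta, solve(hess, grad))]
    return theta

# Two hypothetical relations on four actors.
x = [[[0, 1, 1, 0], [1, 0, 0, 1], [0, 0, 0, 1], [0, 0, 1, 0]],
     [[0, 0, 1, 0], [0, 0, 0, 1], [1, 0, 0, 0], [1, 0, 0, 0]]]
theta_hat = fit_logistic(*design(x))
print(theta_hat)
```

For this particular two-parameter model the estimates can be checked by hand: of the 14 tie variables whose reverse tie is absent, 4 are present, and of the 10 whose reverse tie is present, 6 are present, so the choice estimate is $\log(4/10)$ and the mutuality estimate is $\tfrac{1}{2}[\log(6/4) - \log(4/10)]$. In practice any standard logistic regression routine could replace `fit_logistic`.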

The evaluation of the fit of multivariate p* is not straightforward, but it is helpful to compare the observed values $x_{ijm}$ with the fitted values $\hat{x}_{ijm}$. The fitted values, as is common with dichotomous variables, are defined as $\hat{x}_{ijm} = \hat{P}(X_{ijm} = 1 \mid X^c_{ijm})$. The estimated conditional probabilities are computed from

$$\operatorname{logit} \hat{P}(X_{ijm} = 1 \mid X^c_{ijm}) = \hat{\theta}' \delta(x_{ijm}).$$

Two useful indices of fit are the pseudo-likelihood ratio statistic

$$G^2_{PL} = 2 \sum x_{ijm} \log(x_{ijm} / \hat{x}_{ijm})$$

for a model, and the mean of the absolute value of the residuals $(x_{ijm} - \hat{x}_{ijm})$. In the examples below, we report both $G^2_{PL}$ and the mean absolute residual. Unfortunately, as with all other uses of this MPLE approach, the distribution of $G^2_{PL}$ is unknown, even asymptotically, and there is no straightforward way of estimating the standard errors of parameter estimates (although asymptotic standard errors calculated from logistic regression models can give approximate guidance to the modeller). Crouch & Wasserman (1998) give some preliminary results comparing MPLEs to MLEs and report the optimistic finding that, for moderately large networks ($g > 10$), both standard errors and test statistics based on the pseudo-likelihood approach are quite close to those based on the exact likelihood.
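Both indices of fit are simple functions of the observed and fitted values. The sketch below rests on an assumption that should be stated plainly: the sum in $G^2_{PL}$ is taken over both outcome categories of every tie variable, which gives the usual logistic-regression deviance. The function name and the four observations are hypothetical.

```python
from math import log

def fit_indices(observed, fitted):
    """Return (G2_PL, mean absolute residual).

    observed: 0/1 values of the tie variables.
    fitted:   estimated conditional probabilities, strictly between 0 and 1.
    Assumption: G2_PL is computed as the logistic-regression deviance,
    -2 * sum of the log fitted probability of each observed outcome.
    """
    g2 = 0.0
    abs_residual = 0.0
    for x, p in zip(observed, fitted):
        g2 += -2.0 * (log(p) if x == 1 else log(1.0 - p))
        abs_residual += abs(x - p)
    return g2, abs_residual / len(observed)

observed = [1, 0, 0, 1]
fitted = [0.8, 0.1, 0.4, 0.6]
g2_pl, mean_abs_residual = fit_indices(observed, fitted)
print(round(mean_abs_residual, 3))   # 0.275
```

A model whose fitted probabilities are all 0.5 has a mean absolute residual of exactly 0.5, which gives a rough reference point when reading the residual summaries reported for the examples.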

3.4. Computational details

Maximum pseudo-likelihood estimates of the parameters of model (1) are obtained by fitting the logistic regression model (5). In order to fit model (5), we compute, for each relational tie, the values of the ‘explanatory variables’ $z(x^+_{ijm}) - z(x^-_{ijm})$ corresponding to each statistic $z(x)$; we then use these as the observed explanatory variables for the realization of $X_{ijm}$ (the ‘response variable’) in the logistic regression corresponding to model (5).

The computation of the values $z(x^+_{ijm}) - z(x^-_{ijm})$ is simple, but it is useful to note that the values may take a different form for the various types of relational ties (corresponding to the subscript $m$ of $X_{ijm}$). For example, suppose that there are two relations, $X_l$ and $X_h$, and consider the parameter corresponding to the triadic effect $Z = (X_l X_h) \cap X_h$. If we assume homogeneity, then the sufficient statistic for this parameter is $f_Z$. For the two relations, the computed values of the explanatory variable for this triadic effect are equal to the changes in the statistic $f_Z$ when $x_{ijm}$ changes from 1 to 0, for $m = l$ or $h$. Thus, when $m = l$ (corresponding to the values for the first relation, $X_l$), we compute $\sum_k x_{ikh} x_{jkh}$ as the value of the explanatory variable corresponding to this parameter, and when $m = h$ (corresponding to an $X_h$ tie), we compute $\sum_k x_{ikl} x_{kjh} + \sum_k x_{kil} x_{kjh}$.
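The two closed-form expressions for the triadic effect can be checked against direct differencing of $f_Z$. The following sketch assumes that $X_l X_h$ is the ordinary matrix product, that the intersection is the entrywise product, and that there are no self-ties; the two four-actor relations and the function names are hypothetical.

```python
def f_Z(x_l, x_h):
    """f_Z for Z = (X_l X_h) intersect X_h: sum over i, j, k of
    x_ikl * x_kjh * x_ijh."""
    g = len(x_l)
    return sum(x_l[i][k] * x_h[k][j] * x_h[i][j]
               for i in range(g) for j in range(g) for k in range(g))

def brute_force_change(x_l, x_h, i, j, m):
    """Change in f_Z when the tie (i, j) on relation m goes from 0 to 1."""
    def value(v):
        a = [row[:] for row in x_l]
        b = [row[:] for row in x_h]
        (a if m == "l" else b)[i][j] = v
        return f_Z(a, b)
    return value(1) - value(0)

def closed_form_change(x_l, x_h, i, j, m):
    """The expressions given in the text for m = l and m = h."""
    g = len(x_l)
    if m == "l":
        return sum(x_h[i][k] * x_h[j][k] for k in range(g))
    return (sum(x_l[i][k] * x_h[k][j] for k in range(g))
            + sum(x_l[k][i] * x_h[k][j] for k in range(g)))

# Hypothetical relations on four actors, with zero diagonals.
X_l = [[0, 1, 1, 0], [0, 0, 1, 0], [1, 0, 0, 1], [0, 1, 0, 0]]
X_h = [[0, 1, 0, 1], [1, 0, 1, 1], [0, 1, 0, 0], [1, 0, 1, 0]]

agree = all(brute_force_change(X_l, X_h, i, j, m)
            == closed_form_change(X_l, X_h, i, j, m)
            for m in ("l", "h") for i in range(4) for j in range(4) if i != j)
print(agree)   # True
```

Differencing is slower than the closed forms but needs no algebra, so it is a convenient way to validate hand-derived explanatory variables before a model is fitted.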

4. Examples

We illustrate the construction and fitting of multivariate p* models using two examples.

4.1. The Grade 7 peer network

The first example is an extension of the data analysed by Wasserman & Pattison (1996). Vickers (1981) and Vickers & Chan (1981) obtained network data from 29 students in grade 7 in a school in Victoria, Australia. They asked students to nominate their classmates on a number of relations, including the following:

1) Who are your best friends in the class?
2) Who would you rather not have as a friend?

We label the relations defined by these two questions as $X_B$ (relation 1) and $X_N$ (relation 2), and their associated matrices as $B$ and $N$, respectively. The matrix for the ‘best friends’ relation is given here as our Table 2, and the matrix for the ‘not friends’ relation as our Table 3. As noted by Wasserman & Pattison (1996), actors 1–12 are boys, while actors 13–29 are girls.

In Wasserman & Pattison (1996), we analysed the relation $X_B$ and established that it possessed strong reciprocity and transitivity effects. Here, we fit models simultaneously to the relations $X_B$ and $X_N$ in an attempt to model their mutual interdependence. Our models use the methodology described earlier, and are guided by the literature that has speculated on the structure of positive and negative affect ties (see the discussion in Wasserman & Faust, 1994, Chapter 6, on signed graphs); we also compare our models to previous descriptive analyses of similar types of ties. We report the fit of a number of homogeneous models.

Models 1a and 1b – independence. We first fit two versions of a complete independence model, in which we make the (implausible) assumption that all observed ties are independent. In the first version of the model, we allow a separate ‘choice’ parameter $\theta_Z$ (where $Z$ may be either $B$ or $N$) for each type of relation; in the second, more restricted version, we assume a single common choice parameter. In both versions of the model, the maximal cliques of the dependence graph have the form $\{(i,j,m)\}$; in model 1a, the parameters corresponding to this clique are assumed to depend on relation $m$ (but not on actor $i$ or $j$), whereas in model 1b, the parameter is assumed constant for all $i$, $j$ and $m$. The sufficient statistics for model 1a are $f_B$ and $f_N$; model 1b has sufficient statistic $f_{B+N}$. The fit of models 1a and 1b is summarized in Table 4. Neither model provides a good fit, with the mean of the absolute residuals equal to approximately 0.37. Since model 1b is nested in model 1a, the difference between the pseudo-likelihood ratio statistics is of interest, and we note that model 1b appears to be no worse a fit than model 1a ($\Delta G^2_{PL} = 3.2$, and the models differ by one parameter).

Model 2 – multiplexity. Model 2 is a multiplexity model, with maximal cliques $\{(i,j,1), (i,j,2)\}$. The model allows for the possibility that an $X_B$ tie from $i$ to $j$ is conditionally dependent on an $X_N$ tie from $i$ to $j$. The parameters of the model have the form $\theta_Z$, where $Z$ may be $B$, $N$ or $B \cap N$; the corresponding sufficient statistics are $f_B$, $f_N$ and $f_{B \cap N}$, respectively. Thus, this model adds a single multiplex parameter $\theta_{B \cap N}$ to the two choice parameters in model 1a. Model 2 appears to be a substantial improvement over model 1a ($\Delta G^2_{PL} = 253.7$ with one additional parameter), but the small frequency of $B \cap N$ ties implies that the MPLE of its corresponding parameter is likely to have a large standard error.

Models 3a and 3b – reciprocity and exchange. Model 3 assumes bivariate dyad independence (as described by Wasserman, 1987) and has maximal cliques $\{(i,j,1), (i,j,2), (j,i,1), (j,i,2)\}$. We fit two restricted versions of the model: first, model 3a, in which only choice and reciprocity effects are assumed (with parameters $\theta_Z$ for $Z = B$, $N$, $B \cap B'$ and $N \cap N'$), and second, model 3b, with an additional exchange parameter $\theta_Z$ for the relation $Z = B \cap N'$. In model 3a, the presence of an $X_B$ tie from $i$ to $j$ is assumed to be conditionally dependent on the presence of an $X_B$ tie from $j$ to $i$ (that is, on the presence of reciprocity); similarly for $X_N$ ties. Model 3b allows, in addition, the presence of an $X_B$ tie from $i$ to $j$ to be conditionally dependent on the presence of an $X_N$ tie from $j$ to $i$ (that is, on the exchange of an $X_N$ tie for an $X_B$ one). We have not fitted the most general homogeneous dyad-independence model, which includes multiplexity parameters, since $B$ and $N$ co-occur only rarely (and, as a result, it is difficult to fit parameters corresponding to relations such as $B \cap N$, $B \cap N \cap B'$, and so forth). The fit statistics in Table 4 indicate that not only is model 3a a substantial improvement over model 1a ($\Delta G^2_{PL} = 208.6$ with just two additional parameters), but also that model 3b provides a marginally better fit than model 3a ($\Delta G^2_{PL} = 27.1$ with one additional parameter). These figures suggest the presence of both reciprocity and exchange effects. Note, though, that the fit of model 3b is still not particularly good, with the mean of the absolute residuals equal to 0.311.

Table 2. Vickers & Chan’s (1981) network data: ‘best friends’ relation

0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 1 0 1 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 1 0 1 0 1 1 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 0 1 0
1 1 1 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 0 0 1 1 1
1 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 0 0 1 1 1 1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 1 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 1 0 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 1 0 0 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0

Table 3. Vickers & Chan’s (1981) network data: ‘not friends’ relation

0 0 0 0 0 0 1 0 1 1 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 0 0 1 1
0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 1 1 0 0 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0
1 0 1 1 0 0 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 1
0 0 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
0 0 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0
1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 1
1 0 0 1 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0

Table 4. Summary of fit of models 1a–5b to the grade 7 peer network

Model    No. of parameters    $G^2_{PL}$    Mean absolute residual
1a       2                    1794.1        0.366
1b       1                    1797.3        0.367
2        3                    1540.4        0.329
3a       4                    1585.5        0.315
3b       5                    1558.4        0.311
4        13                   1511.0        0.300
5a       19                   1220.6        0.241
5b       23                   1032.3        0.196

Model 4 – path dependence. Model 4 is a path-dependent model and assumes that a tie of any type from $i$ to $j$ may be conditionally dependent on ties of any type from $j$ to some third individual $k$. Maximal cliques therefore have the form $\{(i,j,m), (j,i,h)\}$ or $\{(i,j,m), (j,k,h), (k,i,p)\}$; parameters and sufficient statistics are given by $\theta_Z$ and $f_Z$, respectively, where $Z$ may be any of the relations $B$, $N$, $B \cap B'$, $N \cap N'$, $B \cap N'$, $BB$, $BN$, $NB$, $NN$, $BB \cap B'$, $BN \cap B'$, $BN \cap N'$ and $NN \cap N'$. Compared to model 3b, model 4 adds only marginally to the fit ($\Delta G^2_{PL} = 47.4$ with eight additional parameters).

Models 5a and 5b – restricted Markov random graph models. The final set of models are path-dependent models with additional dependencies assumed on substantive grounds. All models have the model 4 parameters; in addition, model 5a possesses dependencies consistent with the transitivity-like hypothesis that friends are likely to agree on their relations with third parties (hence likely pairwise conditional dependencies between relational ties of type $X_B$ from $i$ to $j$, $i$ to $k$ and $j$ to $k$, and also between relational ties of type $X_B$ from $i$ to $j$, of type $X_N$ from $i$ to $k$ and of type $X_N$ from $j$ to $k$). Model 5b possesses additional dependencies consistent with the claim that non-friends are likely to disagree on their relations with third parties (hence likely pairwise conditional dependencies between relational ties of type $X_N$ from $i$ to $j$, of type $X_N$ from $i$ to $k$ and of type $X_B$ from $j$ to $k$, and also between relational ties of type $X_N$ from $i$ to $j$, of type $X_B$ from $i$ to $k$ and of type $X_N$ from $j$ to $k$). (See Johnsen (1986) for a review and analysis of the literature on the structure of affective ties, and Pattison (1993) for an algebraic translation of these structural claims.) Model 5a adds $\{(i,j,1), (j,k,1), (i,k,1)\}$ and $\{(i,j,1), (j,k,2), (i,k,2)\}$ to the set of maximal cliques for model 4; model 5b also adds $\{(i,j,2), (j,k,1), (i,k,2)\}$ and $\{(i,j,2), (j,k,2), (i,k,1)\}$. We note that all of the subcliques of these additional maximal cliques have corresponding parameters in models 5a and 5b; these additional subcliques correspond to various forms of stars, $\{(i,j,m), (i,k,h)\}$, $\{(i,j,m), (k,j,h)\}$ and $\{(i,j,m), (j,k,h)\}$. As indicated in Table 4, the additional dependencies assumed by model 5a lead to a substantial improvement over the simple path-dependent model 4 ($\Delta G^2_{PL} = 290.4$ with six additional parameters), and those associated with model 5b lead to a modest further improvement in fit ($\Delta G^2_{PL} = 188.3$ with four additional parameters). The mean of the absolute residuals for model 5b is 0.196, suggesting a more reasonable fit to the data (but one that could lend itself to further possible improvement).
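The nested comparisons reported for models 1a to 5b can be reproduced by simple subtraction. The sketch below assumes that the $G^2_{PL}$ values in Table 4 are read to one decimal place; the dictionary layout and variable names are hypothetical.

```python
# G2_PL values and parameter counts, read from Table 4 to one decimal place.
g2 = {"1a": 1794.1, "1b": 1797.3, "2": 1540.4, "3a": 1585.5,
      "3b": 1558.4, "4": 1511.0, "5a": 1220.6, "5b": 1032.3}
n_params = {"1a": 2, "1b": 1, "2": 3, "3a": 4,
            "3b": 5, "4": 13, "5a": 19, "5b": 23}

# (restricted model, more general model) pairs compared in the text.
comparisons = [("1b", "1a"), ("1a", "2"), ("1a", "3a"), ("3a", "3b"),
               ("3b", "4"), ("4", "5a"), ("5a", "5b")]

differences = {}
for restricted, general in comparisons:
    delta_g2 = round(g2[restricted] - g2[general], 1)
    extra = n_params[general] - n_params[restricted]
    differences[(restricted, general)] = (delta_g2, extra)
    print(f"{restricted} vs {general}: "
          f"change in G2_PL = {delta_g2}, additional parameters = {extra}")
```

Because the distribution of $G^2_{PL}$ is unknown, these differences are descriptive guides to relative fit rather than formal test statistics.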

The MPLEs for the parameters of model 5b are displayed in Table 5. Positive estimates were observed for both reciprocity parameters and for the parameters associated with three of the four additional hypothesized dependencies. Thus, the conditional odds of a tie of any type appear to be enhanced if a reciprocal tie of the same type is present, if the tie completes one of the expected triadic structures for agreement between friends, or if the tie completes a triad in which an individual would rather not have as a friend any friend of someone who has been indicated as a non-friend. Negative estimates were obtained for the exchange parameter, for 2-stars comprising two incoming $X_B$ ties, and for 3-cycles comprising $X_B$ ties. Thus, the conditional odds of a tie of any type appear to be reduced by the presence of a reciprocated tie of the other type; in addition, the odds of an $X_B$ tie being directed to a particular individual are reduced if other $X_B$ ties are also directed to the same individual or if the tie completes a 3-cycle of $X_B$ ties.
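The reading of estimates as changes in conditional odds can be illustrated with a short calculation. The sketch below assumes the usual logit form of the model, in which the conditional log-odds of a tie is the sum of each parameter multiplied by the change in its statistic when the tie is added. The three estimates are taken from Table 5, read with two decimal places; the function name `odds_multiplier` is hypothetical.

```python
import math

# Estimates from Table 5 (two decimal places assumed when reading the table).
estimates = {
    "B reciprocity": 2.53,
    "N reciprocity": 0.61,
    "B-N exchange": -0.67,
}

def odds_multiplier(theta, change=1):
    """Factor applied to the conditional odds of a tie when adding the tie
    raises the associated statistic by `change`, other statistics held fixed."""
    return math.exp(theta * change)

multipliers = {name: odds_multiplier(value) for name, value in estimates.items()}
for name, factor in multipliers.items():
    print(f"{name}: conditional odds multiplied by {factor:.2f}")
```

A multiplier above 1 corresponds to enhanced conditional odds and a multiplier below 1 to reduced conditional odds; the calculation ignores the approximate standard errors, so it is a descriptive aid only.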

4.2. Padgett & Ansell's Florentine network

Table 5. Parameter estimates for model 5b fitted to the grade 7 peer network

| Model parameter | $Z$ | $\eta_Z$ | Approximate standard error |
|---|---|---|---|
| 1-paths (choice) | $B$ | -1.81 | 0.76 |
| | $N$ | -2.39 | 0.65 |
| 2-cycles (reciprocity & exchange) | $B \cap B'$ | 2.53 | 0.37 |
| | $N \cap N'$ | 0.61 | 0.26 |
| | $B \cap N'$ | -0.67 | 0.28 |
| 2-paths | $BB$ | 0.01 | 0.05 |
| | $BN$ | -0.03 | 0.04 |
| | $NB$ | -0.11 | 0.04 |
| | $NN$ | 0.02 | 0.04 |
| 3-cycles | $BB \cap B'$ | -0.72 | 0.14 |
| | $BN \cap B'$ | 0.05 | 0.08 |
| | $BN \cap N'$ | 0.03 | 0.07 |
| | $NN \cap N'$ | -0.05 | 0.09 |
| 2-stars | $BB'$ | -0.36 | 0.08 |
| | $BN'$ | -0.08 | 0.04 |
| | $NN'$ | 0.06 | 0.04 |
| | $B'B$ | -0.01 | 0.04 |
| | $B'N$ | -0.04 | 0.03 |
| | $N'N$ | 0.07 | 0.02 |
| Additional hypothesized constraints | $BB \cap B$ | 0.57 | 0.06 |
| | $BN \cap N$ | 0.17 | 0.05 |
| | $NB \cap N$ | 0.33 | 0.05 |
| | $NN \cap B$ | -0.09 | 0.06 |

Our second example is an analysis of marriage and business ties among groups of Florentine families (Padgett & Ansell, 1993). In an analysis of the rise to power of the Medici family in Florence in the early fifteenth century, Padgett & Ansell constructed a number of network relations among 33 groups of elite families, including marriage and business or economic ties. The construction was based on a coding of various types of network relations among a 92-family ruling elite from Kent's (1978) description of the network foundations of the Medici party and their opponents. Padgett & Ansell used marriage and economic networks to derive a clustering of the 92 families into 33 family groups (using the CONCOR algorithm; see Breiger, Boorman & Arabie, 1975); they then coded a relation of a particular type between two family groups if there were at least two pairs of families, with one family from each group, linked by a relation of that type. The analysis presented below is for marriage and economic relations among these 33 family groups, shown in figure 2a of Padgett & Ansell (1993); for the purpose of the analysis reported below, within-group relationships are ignored and the various types of economic ties are aggregated into a single business/economic relation. Thus a marriage tie is coded from one group to another if a woman of the first group is married to a man in the second; a business/economic tie signifies the presence of trading or partnership relationships, the sharing or renting of real estate, or a bank employment relation (see Padgett & Ansell, 1993, pp. 1265–1266).
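The rule used to code ties between family groups (a group-level tie is recorded when at least two pairs of families, one from each group, are linked) can be sketched as follows. This is an illustrative reading of the rule as described in the text, not Padgett & Ansell's own procedure; the function name `code_group_ties`, the family labels and the group assignments are all hypothetical.

```python
from collections import Counter

def code_group_ties(family_ties, group_of, threshold=2):
    """Return the set of directed group-level ties (g, h), g != h, that are
    supported by at least `threshold` distinct directed family pairs."""
    counts = Counter()
    for a, b in set(family_ties):      # each directed family pair counted once
        g, h = group_of[a], group_of[b]
        if g != h:                      # within-group relationships are ignored
            counts[(g, h)] += 1
    return {pair for pair, c in counts.items() if c >= threshold}

# Hypothetical families f1..f5 in groups 1..3 (not the Florentine data).
group_of = {"f1": 1, "f2": 1, "f3": 2, "f4": 2, "f5": 3}
marriage = [("f1", "f3"), ("f2", "f4"), ("f1", "f5"), ("f1", "f2")]

group_marriage = code_group_ties(marriage, group_of)
print(group_marriage)
```

Here groups 1 and 2 are linked by two family pairs and so receive a tie, the single link from group 1 to group 3 falls below the threshold, and the tie between f1 and f2 is dropped because both families belong to group 1.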

Padgett & Ansell used the interconnections among social and demographic factors, these relational ties, and actions on the part of Cosimo de' Medici to explain the source of the latter's extraordinary power; here we examine the joint network structure of the marriage and business/economic ties.

We label the relations studied by Padgett & Ansell as $X_B$ (business ties) and $X_M$ (marriage ties). Their associated matrices are B and M, respectively.

In Table 6, we report the fit of six classes of models, similar in construction to those reported for the grade 7 peer network. As for the grade 7 peer network, models 1a and 1b are two- and one-parameter complete independence models, respectively, and model 2 is a multiplexity model. It is clear from Table 6 that there is little improvement in fit of the two-parameter choice complete independence model (model 1a) over the one-parameter choice model (model 1b) ($\Delta G^2_{PL} = 0.7$ with one extra parameter); in addition, permitting dependencies among marriage and business ties for the same individuals does little to improve model fit ($\Delta G^2_{PL} = 0.4$ for model 2 compared to model 1a). Models 3a and 3b are reciprocity and exchange models. Model 3a adds to model 1a the reciprocity effects for $X_B$ and $X_M$ ties; model 3b further adds the exchange effect that allows conditional dependence of a marriage tie from i to j and a business tie from j to i. The reciprocity effects in model 3a lead to a substantial improvement in fit over model 1a ($\Delta G^2_{PL} = 164.0$ with two additional parameters), but no further improvement is achieved by permitting the dyadic exchange of marriage and business ties ($\Delta G^2_{PL} = 0.2$). Model 4 is a path-dependent model and is a marginal improvement in fit over model 3b ($\Delta G^2_{PL} = 45.1$ with six additional parameters). Parameters corresponding to cycles with two or more business ties were excluded from the model because of the infrequency of occurrence of such structures.
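The model comparisons quoted above are differences between the pseudolikelihood deviances of nested models. As a small worked check, the sketch below recomputes them from the first six rows of Table 6, read with one decimal place. The dictionary `table6` and the function `compare` are hypothetical names; the differences serve as informal guides to fit, as in the text, rather than as formal test statistics.

```python
# (number of parameters, G2_PL) for models 1a to 4, as read from Table 6
# with one decimal place assumed.
table6 = {
    "1b": (1, 487.9), "1a": (2, 487.2), "2": (3, 486.8),
    "3a": (4, 323.2), "3b": (5, 323.0), "4": (11, 277.9),
}

def compare(smaller, larger):
    """Return (drop in G2_PL, number of extra parameters) when moving
    from the model `smaller` to the nested, larger model `larger`."""
    p0, g0 = table6[smaller]
    p1, g1 = table6[larger]
    return round(g0 - g1, 1), p1 - p0

pairs = [("1b", "1a"), ("1a", "2"), ("1a", "3a"), ("3a", "3b"), ("3b", "4")]
comparisons = {pair: compare(*pair) for pair in pairs}
for (small, large), (drop, extra) in comparisons.items():
    print(f"model {small} -> model {large}: "
          f"delta G2_PL = {drop}, extra parameters = {extra}")
```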

Table 6. Summary of fit of models 1a–6d to the Florentine network

| Model | No. of parameters | $G^2_{PL}$ | Mean absolute residual |
|---|---|---|---|
| 1a | 2 | 487.2 | 0.048 |
| 1b | 1 | 487.9 | 0.048 |
| 2 | 3 | 486.8 | 0.048 |
| 3a | 4 | 323.2 | 0.032 |
| 3b | 5 | 323.0 | 0.032 |
| 4 | 11 | 277.9 | 0.029 |
| 5a | 18 | 243.7 | 0.026 |
| 5b | 17 | 246.3 | 0.026 |
| 6a | 21 | 227.9 | 0.026 |
| 6b | 23 | 226.7 | 0.026 |
| 6c | 23 | 225.2 | 0.026 |
| 6d | 23 | 217.0 | 0.025 |

Since, as Padgett & Ansell (1993) note, the gaining of hierarchical status was the primary consideration in the arrangement of marriage ties between elite families, we might expect marriage ties to exhibit a tendency towards transitivity. Hence model 5a assumes, in addition to conditional dependencies for paths of length 2, pairwise conditional dependencies among marriage ties from i to j, j to k, and i to k (and hence adds a parameter corresponding to the relation $X = MM \cap M$). Further, all possible stars comprising two relations are added as well, in order to investigate possible interdependencies between marriage and business ties that are not evident at the level of ties from an actor i to an actor j (see the comparison between the complete independence model 1a and the multiplex model 2). These dependencies also require various star parameters $\eta_Z$ for $Z$ equal to $MM'$, $M'M$, $M'B$ and $BB'$.

The fit of model 5a was a modest improvement over that of model 4 ($\Delta G^2_{PL} = 34.2$ with six additional parameters). The estimated parameter corresponding to the relation $MM \cap M$ is not large, so in model 5b the parameter is removed, with little effect on the fit of the model ($\Delta G^2_{PL} = 2.6$).

A final set of models fitted to the data investigated the possibility of structural differences in ties according to party affiliation. As Padgett & Ansell (1993) observed, the first 10 family groups are substantially identified with the Medici party (the Medici family themselves comprising group 1), whereas the remaining groups of families are not. Padgett & Ansell described the remarkable structural differences between the network of relations within the Medici party and within the remaining (largely oligarchic) set. Models 6a–6d therefore allow various model 5b parameters to differ according to whether they refer to ties lying within the collection of Medici blocks, to ties connecting non-Medici blocks, or to ties crossing the boundary between the two collections of blocks. Model 6a allows such variation for the density parameter and is a substantial improvement over model 5b ($\Delta G^2_{PL} = 18.4$ with four additional parameters). Model 6b permits the parameters for 'mixed' out-stars comprising marriage and business ties to differ for the three types of blocks and is not a substantial improvement over model 6a ($\Delta G^2_{PL} = 1.4$). Model 6c allows heterogeneity across blocks in the parameters for 2-paths comprising marriage and business ties; it also fails to improve fit compared to model 6a ($\Delta G^2_{PL} = 2.5$). The final model, 6d, permits heterogeneity across blocks in the parameters for paths comprising two marriage ties; in this case, there is a modest improvement in fit compared to model 6a ($\Delta G^2_{PL} = 10.8$ with two additional parameters).

The estimated parameters for model 6d are shown in Table 7. The estimates suggest a strong tendency for reciprocated business ties, a tendency that is unsurprising given the form of business or economic ties, such as partnerships. There are weaker tendencies for the existence of 2-paths comprising either marriage or business ties; marriage ties also appear to be more likely if they complete a cycle of three marriage ties. Padgett & Ansell (1993) noted the presence of these cycles and analysed both their development and their consequences; they make a compelling argument for their importance to the evolving structure of the oligarchy. It can also be seen from Table 7 that path structures in which an outgoing marriage tie is accompanied by an incoming business tie reduce the likelihood of the overall structure. Estimates of star parameters suggest the prevalence of heterogeneous stars, in which a group of families have marriage ties with one group and business ties with another. The parameter estimates for homogeneous marriage in-stars and out-stars are both negative: there appears to have been a reduced conditional probability of a marriage tie to a family group if some other group also had such a tie and, to a lesser extent, if the first family group had another outgoing marriage tie.
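The block-dependent effects of models 6a–6d rest on statistics that are tallied separately for ties within the Medici collection, within the remaining collection, and between the two. The following sketch shows one way such a split of the tie count might be computed for a single relation. The function name `block_counts`, the toy matrix `M` and the choice of groups 0 and 1 as the 'Medici' blocks are assumptions for illustration, not the Florentine data.

```python
def block_counts(x, medici):
    """Split the tie count of the binary matrix `x` into ties within the
    Medici collection, within the other collection, and between the two."""
    counts = {"within_medici": 0, "within_other": 0, "between": 0}
    n = len(x)
    for i in range(n):
        for j in range(n):
            if i == j or not x[i][j]:
                continue
            if i in medici and j in medici:
                counts["within_medici"] += 1
            elif i not in medici and j not in medici:
                counts["within_other"] += 1
            else:
                counts["between"] += 1
    return counts

# Hypothetical 4-group example; groups 0 and 1 play the role of Medici blocks.
M = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 0, 0, 1],
     [0, 0, 0, 0]]

counts = block_counts(M, medici={0, 1})
print(counts)
```

The three counts always sum to the overall tie count, so only two of the three associated parameters can vary freely alongside an overall density parameter.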

Table 7. Parameter estimates for model 6d fitted to the Florentine network

| Model parameter | $Z$ | $\eta_Z$ | Approximate standard error |
|---|---|---|---|
| 1-paths (choice) | $M$ | -5.17 | 1.02 |
| | $B$ | -7.37 | 1.25 |
| 2-cycles (reciprocity and exchange) | $M \cap M'$ | 0.95 | 0.94 |
| | $B \cap B'$ | 10.33 | 1.72 |
| | $M \cap B'$ | 0.65 | 1.08 |
| 2-paths | $MM$ | 0.66 | 0.32 |
| | $MB$ | 0.16 | 0.38 |
| | $BM$ | -0.84 | 0.37 |
| | $BB$ | 1.26 | 0.95 |
| 3-cycles | $MM \cap M'$ | 2.12 | 0.61 |
| | $MB \cap M'$ | -0.35 | 0.85 |
| 2-stars | $MM'$ | -1.55 | 0.37 |
| | $M'M$ | -0.43 | 0.20 |
| | $BB'$ | -1.53 | 1.08 |
| | $B'B$ | -0.85 | 0.99 |
| | $MB'$ | -0.14 | 0.36 |
| | $M'B$ | 0.92 | 0.35 |
| Subgroup-dependent $M$ effects | 1-paths within Medici | 3.71 | 1.12 |
| | 1-paths between subgroups | -4.67 | 1.92 |
| | 1-paths within other subgroups | 0.96 | |
| Subgroup-dependent $B$ effects | 1-paths within Medici | 0.70 | 1.06 |
| | 1-paths between subgroups | -0.80 | 0.87 |
| | 1-paths within other subgroups | 0.10 | |
| Subgroup-dependent $MM$ effects | 2-paths within Medici | -1.33 | 0.46 |
| | 2-paths between subgroups | 1.08 | 0.44 |
| | 2-paths within other subgroups | 0.25 | |

The parameters for block-dependent densities suggest an enhanced likelihood of marriage ties within the Medici collection of family groups and, to a lesser extent, within the non-Medici collection; marriage ties between the two types of family groups were less likely. Business ties exhibit a substantially weaker pattern of the same form. Together, these characteristics of the network reflect what Padgett & Ansell noted was a remarkable interdependence of marriage and economic ties, on the one hand, and political partisanship, on the other, and they support their conclusion that the microstructure of marriage and economics was central to the formation of parties in Florence (1993, p. 1277). The block-dependence of marriage 2-paths takes a different and interesting form: such paths are less likely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection, and they are even more likely to link family groups of different types. The group containing members of the Medici family is the major contributor to this pattern, as they are the only Medici group with marriage connections outside the collection mobilized into the Medici party. Note that this structural effect is fitted at the same time as the cyclic pattern for marriage ties, so that although, as Padgett & Ansell noted, there are many more two-step marriage connections for non-Medici than for Medici partisans, many of the former connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).

Thus, model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell's detailed account.

5. Conclusion

The multivariate p* model is very general in form and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996), and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number r of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research: much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus, a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus, the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form z(x) of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different 'parts' of the network (and in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that in the literature on algebraic models for multivariate networks there is a second tradition that can usefully guide such statistical developments. Local structural descriptions based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995) describe heterogeneity across individuals. Thus, a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown in general how Markov chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).
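To indicate what simulating a fitted model of this kind might involve, the following is a minimal sketch of a single-tie Metropolis sampler for a two-relation network with choice, reciprocity, multiplexity and exchange effects only. It is not Preisler's procedure or the authors' software: the function names, the parameter dictionary `theta` and its values are hypothetical, and the proposal (toggling one randomly chosen tie) is symmetric, so the acceptance probability reduces to the ratio of model probabilities.

```python
import math
import random

def change_stats(x, m, i, j):
    """Change in each statistic if tie (i, j, m) goes from 0 to 1,
    for a network x[m][i][j] with two relations m in {0, 1}."""
    other = 1 - m
    return {
        ("choice", m): 1,
        ("recip", m): x[m][j][i],         # completes a mutual dyad of type m
        ("multiplex",): x[other][i][j],   # both types of tie from i to j
        ("exchange",): x[other][j][i],    # tie of the other type returned
    }

def metropolis(theta, g, steps, seed=0):
    """Simulate a two-relation network on g nodes by single-tie updates."""
    rng = random.Random(seed)
    x = [[[0] * g for _ in range(g)] for _ in range(2)]
    for _ in range(steps):
        m = rng.randrange(2)
        i, j = rng.sample(range(g), 2)    # two distinct nodes
        delta = sum(theta[k] * v for k, v in change_stats(x, m, i, j).items())
        if x[m][i][j] == 1:
            delta = -delta                # the proposal removes the tie
        if delta >= 0 or rng.random() < math.exp(delta):
            x[m][i][j] = 1 - x[m][i][j]
    return x

# Hypothetical parameter values, chosen only for illustration.
theta = {
    ("choice", 0): -1.5, ("choice", 1): -2.0,
    ("recip", 0): 2.0, ("recip", 1): 0.5,
    ("multiplex",): 0.0, ("exchange",): -0.5,
}
x = metropolis(theta, g=8, steps=5000, seed=1)
print(sum(map(sum, x[0])), sum(map(sum, x[1])))
```

In a parametric bootstrap, networks simulated in this way from the fitted parameters would each be refitted, and the spread of the refitted estimates used as a standard error; the burn-in length and the number of steps between retained networks would need to be chosen with care.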

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754) and the National Institute of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance, and Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.

Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.

Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.

Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Association, 47, 77–92.

Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.

Boorman, S. A., & White, H. C. (1976). Social structure from multiple networks: II. Role structures. American Journal of Sociology, 81, 1384–1446.

Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.

Breiger, R. L., Boorman, S. A., & Arabie, P. (1975). An algorithm for clustering relational data, with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.

Coleman, J. S., Katz, E., & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N., & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R., & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B., & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency, and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure II: Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E., & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M., & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.

Fienberg, S. E., Meyer, M. M., & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds), Operations research proceedings 1986, pp. 455–460. Berlin/Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathematiques, Informatique et Science Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B., & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O., & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds), Quo Vadis, Graph Theory? Annals of Discrete Mathematics, 55, 349–366.

Frank, O., & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J., & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J., & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W., & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W., & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J., & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D., & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.

Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L., & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E., & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A., & Pattison, P. (1998). Multivariate p* models of producers' market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P., & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F., & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P., & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G., & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). p12. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D., & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M., & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.

Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S., & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S., & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 60, 401–425.

Wasserman, S., & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S., & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A., & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C., & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999


Column effects are associated with conditional dependencies of the form $\{(i, j, m), (k, j, h)\}$, and so with maximal cliques

$$\{(1, j, 1), (1, j, 2), \ldots, (1, j, r),\ (2, j, 1), (2, j, 2), \ldots, (2, j, r),\ \ldots,\ (g, j, 1), (g, j, 2), \ldots, (g, j, r)\}.$$

Position effects and blockmodels. A fifth theme in the structural analysis of multiple networks is that distinctive patterns of inter-individual ties are associated with particular social positions. Thus, individuals occupying similar social positions may exhibit similar conditional dependencies among ties, whereas those occupying distinct positions may possess quite distinct inter-tie dependencies. Thus, knowledge of social position may be used as a basis for some hypothesized equations among the parameters referring to particular patterns of conditional dependencies (determined by cliques in the dependence graph); these issues are further discussed below.

Interdependence of interlocking roles. In addition, several of these arguments may be combined. For instance, if we assume conditional dependencies associated with arguments for multiplexity, reciprocity and exchange, as well as role-interlocking effects, then the class of Markov random multigraphs results; its maximal cliques have the form of either a multivariate triad

$$\{(i, j, 1), (i, j, 2), \ldots, (i, j, r),\ (j, k, 1), (j, k, 2), \ldots, (j, k, r),$$
$$(k, i, 1), (k, i, 2), \ldots, (k, i, r),\ (j, i, 1), (j, i, 2), \ldots, (j, i, r),$$
$$(k, j, 1), (k, j, 2), \ldots, (k, j, r),\ (i, k, 1), (i, k, 2), \ldots, (i, k, r)\}$$

or a multivariate star

$$\{(1, i, 1), (1, i, 2), \ldots, (1, i, r),\ (2, i, 1), (2, i, 2), \ldots, (2, i, r),\ \ldots,$$
$$(g, i, 1), (g, i, 2), \ldots, (g, i, r),\ (i, 1, 1), (i, 1, 2), \ldots, (i, 1, r),$$
$$(i, 2, 1), (i, 2, 2), \ldots, (i, 2, r),\ \ldots,\ (i, g, 1), (i, g, 2), \ldots, (i, g, r)\}.$$

Note that these three assumptions also entail actor effects; hence we claim that the class of Markov random multigraphs is a quite plausible framework for the modelling of structure in multiple social networks.
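The triad and star cliques above both consist of possible ties that pairwise share at least one node. Under that reading of the Markov assumption, the set of possible ties that are conditionally dependent on a given tie can be listed directly, as in the sketch below. The function name `markov_neighbours` is hypothetical, and nodes and relations are numbered from 0 rather than from 1 as in the text.

```python
from itertools import permutations

def markov_neighbours(tie, g, r):
    """All possible ties (k, l, h) on g nodes and r relations that share at
    least one node with `tie` = (i, j, m), excluding the tie itself."""
    i, j, m = tie
    return {
        (k, l, h)
        for k, l in permutations(range(g), 2)
        for h in range(r)
        if {k, l} & {i, j} and (k, l, h) != tie
    }

nbrs = markov_neighbours((0, 1, 0), g=4, r=2)
print(len(nbrs))
print((1, 0, 1) in nbrs, (2, 3, 0) in nbrs)
```

With four nodes and two relations, only the possible ties between nodes 2 and 3 are conditionally independent of a tie from node 0 to node 1; every other possible tie is a neighbour in the dependence graph.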

3.1.3. Homogeneity constraints

For many of the specific dependence graphs that we have discussed, particularly for Markov random multigraphs, model (1) may require the estimation of a large number of parameters. It is often useful, therefore, to introduce certain equality constraints among the parameters or to set certain parameters to zero. One can define a class of homogeneous models for multivariate networks in which networks that are isomorphic under relabellings of the nodes are equiprobable.

More generally, we introduce an assumption that parameters corresponding to certain isomorphic configurations of nodes are equal. We identify a random graph configuration with a subset A of $N_D$, and we call configurations corresponding to A and B isomorphic if there is a one-to-one mapping $\varphi$ on the nodes in N such that $(i, j, m) \in A$ if and only if $(\varphi(i), \varphi(j), m) \in B$, for $i, j \in N$, $m \in R$. If two configurations A and B are isomorphic, we set $\lambda_A = \lambda_B$ and note that the sufficient statistic corresponding to $\lambda_A$ becomes

$$\sum_{B} \prod_{(i,j,m) \in B} x_{ijm},$$

where the summation is over all configurations B isomorphic to A.

A more restricted form of parameter equating may also be useful when the nodes of the random graph are hypothesized to fall into distinct classes or positions (as in an a priori blockmodel; see, for example, White et al., 1976; Wasserman & Anderson, 1987; Wasserman & Faust, 1994, Chapter 10). In this case, the random graph nodes of the configuration identified with the subset A may be regarded as coloured, and two configurations A and B are defined as isomorphic if there is a one-to-one mapping $\varphi$ on the nodes of N such that

1. $(i, j, m) \in A$ if and only if $(\varphi(i), \varphi(j), m) \in B$;
2. i and $\varphi(i)$ have the same colour;
3. j and $\varphi(j)$ have the same colour.

We then set $\lambda_A = \lambda_B$ only if A and B are isomorphic (using this more restrictive definition).

3.2. The multivariate p* model

As mentioned, we refer to equation (1) as the multivariate p* model. The parameters of the model correspond to the cliques of the dependence graph $D$. The sufficient statistic corresponding to the parameter $\lambda_A$ for clique $A$ of $D$ has the form $\prod_{(i,j,m)\in A} x_{ijm}$; in the case where homogeneity effects have been imposed, the sufficient statistics are counts of such values over cliques whose parameters are set to be equal.

3.2.1. Introduction

The dependence structures for social networks described in the preceding section give rise to network statistics identified in Table 1. In order more easily to define the statistics, we introduce a counting function $f$ for an array $Z$ as the sum of entries in the array: $f_Z = \sum_{i,j} Z_{ij}$. The function $f$ is a count of the number of distinct ordered pairs of nodes $i$ and $j$ for which there is a relational tie of type $Z$. For convenience, we refer to the parameter corresponding to $f_Z$ as $\theta_Z$.

Table 1. Some statistics and parameters for univariate and multivariate relations

Effect                      Parameter         Graph statistic in ‘count form’

Single dichotomous relations
  Choice                    $\theta$          $f_X$
  Mutuality                 $\rho$            $f_{X \cap X'}$
  Transitivity              $\tau$            $f_{XX \cap X}$
  Expansiveness             $\alpha_i$        $f_{X \cap R_i}$
  Attractiveness            $\beta_j$         $f_{X \cap C_j}$
  m-paths                   $\pi_m$           $f_{X^m}$
  Subgroup density          $\phi_{[st]}$     $f_{X \cap \delta_{st}}$
  Subgroup mutuality        $\rho_{[st]}$     $f_{X \cap X' \cap \delta_{st}}$
  Subgroup transitivity     $\tau_{[sut]}$    $f_{(X \cap \delta_{su})(X \cap \delta_{ut}) \cap (X \cap \delta_{st})}$

Multivariate relations
  Association               C                 $f_{X \cap Y}$
  Multiplexity              $\theta_{kl}$     $f_{X_k \cap X_l}$
  Exchange                  $\rho_{kl}$       $f_{X_k \cap X_l'}$
  Generalized transitivity  $\tau_{klm}$      $f_{(X_k X_l) \cap X_m}$

When homogeneity constraints are imposed, we can represent the sufficient statistics in a compact form. For the assumption of multiplex conditional dependencies, any clique in $N_D$ has the form

$$A = \{(i,j,m_1),(i,j,m_2),\ldots,(i,j,m_q)\};$$

thus, in the homogeneous case, the sufficient statistic for the multiplex parameter associated with the clique $A$ is $f_Z$, where $Z = X_{m_1} \cap X_{m_2} \cap \cdots \cap X_{m_q}$. (Note that any non-empty subset of relations gives rise to a clique of this form, so that we also have statistics of the form $f_{X_m}$, $f_{X_k \cap X_l}$, and so forth.)

Reciprocity cliques of the form $\{(i,j,m),(j,i,l)\}$ give rise to the exchange statistics $f_{X_k \cap X_l'}$. Cliques in role-interlocking dependence structures lead to additional 2-path and 3-cycle statistics of the form $f_{X_m X_h}$ and $f_{(X_m X_h) \cap X_n'}$, respectively.

Some of the statistics for parameters reflecting row and column effects can be defined using the indicator matrices $R_i$ and $C_j$, whose elements are given by

$$
(R_i)_{kl} = \begin{cases} 1 & \text{if } k = i \\ 0 & \text{otherwise} \end{cases}
\qquad
(C_j)_{kl} = \begin{cases} 1 & \text{if } l = j \\ 0 & \text{otherwise.} \end{cases}
$$

In order to define statistics for the Markov random multigraph model, let $R_k$ be any subset of relations and define $Y_k$ as the intersection of the relations in $R_k$. The triad statistic corresponding to a general multivariate triad has the general form $f_Z$, with $Z = (Y_1 \cap Y_4')(Y_2 \cap Y_5') \cap (Y_3 \cap Y_6')$, for some $Y_1, Y_2, \ldots, Y_6$.

When homogeneity is imposed only within $S$ possible blocks or positions, the network statistics that arise correspond to within-block sums, and can be represented by using the matrix $\delta_{st}$, with entries

$$
(\delta_{st})_{ij} = \begin{cases} 1 & \text{if } i \in \text{block } s \text{ and } j \in \text{block } t \\ 0 & \text{otherwise.} \end{cases}
$$

For example, in the case of any homogeneous statistic $f_Z$, the block-homogeneous set of statistics is $f_{Z \cap \delta_{st}}$, for $s = 1, 2, \ldots, S$ and $t = 1, 2, \ldots, S$.

Some other network statistics and associated parameters are also presented in Table 1. This table also identifies the parameter labels used in Wasserman & Pattison (1996) and their generalizations to multivariate networks.

Note that each of the statistics described above may be assumed to be homogeneous, or may be allowed to depend on some mutually exclusive and exhaustive partition of actors or pairs of actors. For example, generalized transitivity statistics may be calculated for every triple of subgroups arising from a partition (for example, $f_{(X_k \cap \delta_{su})(X_l \cap \delta_{ut}) \cap (X_m \cap \delta_{st})}$), and may be used to assess the homogeneity of generalized transitivity across subgroups.
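The count-form statistics above can be computed directly from the binary sociomatrices. The following is a minimal sketch, not part of the original analysis: the function names and the toy matrices are hypothetical, intersection is taken as the elementwise product, composition as the ordinary matrix product, and the counting function is assumed to sum over ordered pairs with $i \neq j$ (our reading of "distinct ordered pairs").

```python
from typing import List

Matrix = List[List[int]]


def count(z: Matrix) -> int:
    """Counting function f_Z, summed over ordered pairs i != j (an assumption)."""
    g = len(z)
    return sum(z[i][j] for i in range(g) for j in range(g) if i != j)


def transpose(x: Matrix) -> Matrix:
    """Converse relation X'."""
    return [list(col) for col in zip(*x)]


def intersect(x: Matrix, y: Matrix) -> Matrix:
    """Elementwise product, i.e. the intersection of two binary relations."""
    return [[a * b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def compose(x: Matrix, y: Matrix) -> Matrix:
    """Matrix product; entry (i, j) counts 2-paths i -> k -> j."""
    g = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(g)) for j in range(g)]
            for i in range(g)]


def block_indicator(blocks: List[int], s: int, t: int) -> Matrix:
    """delta_st: 1 where i is in block s and j is in block t, else 0."""
    return [[1 if bi == s and bj == t else 0 for bj in blocks] for bi in blocks]


# Hypothetical toy data: two relations on g = 3 actors, zero diagonals.
X1 = [[0, 1, 0],
      [1, 0, 1],
      [0, 0, 0]]
X2 = [[0, 1, 1],
      [0, 0, 0],
      [1, 0, 0]]

multiplexity = count(intersect(X1, X2))                          # f_{X1 n X2}
exchange = count(intersect(X1, transpose(X2)))                   # f_{X1 n X2'}
two_paths = count(compose(X1, X2))                               # f_{X1 X2}
three_cycles = count(intersect(compose(X1, X2), transpose(X1)))  # f_{(X1 X2) n X1'}

# Block-homogeneous version of the choice statistic for block pair (0, 0),
# with actors 0 and 1 placed in block 0 and actor 2 in block 1.
blocks = [0, 0, 1]
within_block0_choice = count(intersect(X1, block_indicator(blocks, 0, 0)))
```

Each statistic in Table 1 is then a single call to `count` on a matrix built from these three operations, which is why the homogeneous models need only a handful of array products per statistic.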


3.2.2. The model

In combination with various homogeneity constraints, model (1) can be written in the general form

$$P(X = x) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)} \qquad (2)$$

where $\theta$ is a vector of model parameters and $z(x)$ is a vector of network statistics. As we have described, these vectors depend on the structure of the hypothesized dependence graph and on whether any homogeneity constraints have been proposed.

The model is of exponential family form; that is, the probability function depends on an exponential function of a linear combination of network statistics. In some cases, constraints on the elements of $\theta$ are required in order to ensure a set of uniquely determined parameters (as we illustrate later with our examples). Usually the elements of $\theta$ are unknown and must be estimated.

The function $\kappa(\theta)$ in the denominator of model (2) is a normalizing quantity whose value guarantees that the probability distribution is indeed proper, summing to unity over the sample space of the random variable $X$ (the set of all possible multivariate networks with $r$ relations and $g$ actors).

Estimation of the parameters of models that assume only multiplexity and/or generalized reciprocity and exchange effects (as in the multivariate $p_1$ model) is not particularly difficult. In these cases, the likelihood function is simply the product of the probabilities for each multivariate tie or dyad (for example, see Wasserman, 1987). Estimation of parameters of the general multivariate p* model is not straightforward, however. The likelihood function for the parameters $\theta$ of p* depends on the complicated normalizing quantity $\kappa(\theta)$, which makes maximum likelihood estimation difficult, except in special circumstances (such as dyadic independence) and when the multigraphs are quite small (Walker, 1995). In order for probabilities to be computed, one must be able to calculate $\kappa$, which requires a sum over all $2^{rg(g-1)}$ possible multigraphs and is infeasible for most networks. Hence, alternative model formulations and approximate estimation techniques are important. One such alternative, which we now describe, utilizes log-odds ratios of the conditional probabilities of each element of $X$.
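To make the difficulty concrete, the normalizing quantity can be evaluated by brute force only for very small multigraphs. The sketch below is illustrative rather than part of the original analysis: the function names and the choice of statistics (tie count, same-type mutual pairs, multiplex pairs) are hypothetical, and the enumeration is feasible only because $g = 3$ and $r = 2$ give just $2^{12} = 4096$ networks.

```python
import itertools
import math


def tie_index(g, r):
    """All (i, j, m) with i != j: one entry per tie variable X_ijm."""
    return [(i, j, m) for m in range(r) for i in range(g) for j in range(g) if i != j]


def z(x, g, r):
    """Illustrative statistics: (ties, same-type mutual pairs, multiplex pairs)."""
    choice = sum(x.values())
    mutual = sum(x[(i, j, m)] * x[(j, i, m)]
                 for m in range(r) for i in range(g) for j in range(i + 1, g))
    multiplex = 0
    if r > 1:  # only relations 0 and 1 are paired in this sketch
        multiplex = sum(x[(i, j, 0)] * x[(i, j, 1)]
                        for i in range(g) for j in range(g) if i != j)
    return (choice, mutual, multiplex)


def kappa(theta, g, r):
    """Normalizing quantity: sum of exp(theta' z(x)) over every multigraph x."""
    ties = tie_index(g, r)
    total = 0.0
    for bits in itertools.product((0, 1), repeat=len(ties)):
        x = dict(zip(ties, bits))
        total += math.exp(sum(t * s for t, s in zip(theta, z(x, g, r))))
    return total


def sample_space_size(g, r):
    """Number of distinct multigraphs with g actors and r relations."""
    return 2 ** (r * g * (g - 1))


kappa_at_zero = kappa((0.0, 0.0, 0.0), 3, 2)  # every network has weight 1
size_grade7 = sample_space_size(29, 2)        # g = 29, r = 2, as in Section 4.1
```

With all parameters zero, $\kappa$ simply counts the networks; with only the choice parameter non-zero the ties are independent and $\kappa(\theta) = (1 + e^{\theta})^{rg(g-1)}$, which gives a convenient check on the enumeration. For a network of 29 actors and two relations the sample space has $2^{1624}$ elements, so enumeration is out of the question.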

3.2.3. The logit model

We can turn model (2) into a generalized autologistic model for conditional probabilities, giving us an equivalence between model (2) and spatial models (Besag, 1972, 1974; Strauss, 1992). The step utilizes the dichotomous nature of the random variable $X_{ijm}$ and produces an approximate likelihood function that is much easier to deal with.

We first condition on the complement of $X_{ijm}$, and consider just the probability that the dichotomous random variable $X_{ijm}$ is unity. Recall that this variable records whether the tie from $i$ to $j$ of type $m$ is present. Specifically, consider

$$
P(X_{ijm} = 1 \mid X^c_{ijm})
= \frac{P(X = x^+_{ijm})}{P(X = x^+_{ijm}) + P(X = x^-_{ijm})}
= \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^+_{ijm})\} + \exp\{\theta' z(x^-_{ijm})\}} \qquad (3)
$$


which has the advantage of not depending on the normalizing quantity. We next consider the odds ratio, which simplifies model (3):

$$
\frac{P(X_{ijm} = 1 \mid X^c_{ijm})}{P(X_{ijm} = 0 \mid X^c_{ijm})}
= \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^-_{ijm})\}}
= \exp\{\theta' [z(x^+_{ijm}) - z(x^-_{ijm})]\}. \qquad (4)
$$

From this, the log-odds ratio, or logit model, has the rather simple expression

$$
\omega_{ijm} = \log\left( \frac{P(X_{ijm} = 1 \mid X^c_{ijm})}{P(X_{ijm} = 0 \mid X^c_{ijm})} \right)
= \theta' [z(x^+_{ijm}) - z(x^-_{ijm})]. \qquad (5)
$$

If we define $d(x_{ijm}) = [z(x^+_{ijm}) - z(x^-_{ijm})]$, then the logit model (5) simplifies succinctly to $\omega_{ijm} = \theta' d(x_{ijm})$. The expression $d(x_{ijm})$ is the vector of changes in the network statistics that arises when the quantity $x_{ijm}$ changes from 1 to 0; that is, each statistic evaluated with the tie present minus the same statistic evaluated with the tie absent. This version of the model is a logit p* model for a multivariate network, and is a generalized autologistic model (see Strauss, 1992) applied to social network data.

3.3. Estimation

The likelihood function for the general form of the multivariate p* model (2) is

$$L(\theta) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)},$$

where the dependence on the normalizing quantity can easily be seen. As mentioned, maximum likelihood estimation of $\theta$ is difficult due to the size of the sample space.

An approximate estimation approach, proposed by Besag (1975, 1977b) and adopted by Strauss (1986), Strauss & Ikeda (1990) and Wasserman & Pattison (1996), utilizes tools made popular in models for rectangular lattices and spatial data; specifically, we use the logit formulation and define the pseudo-likelihood function as

$$
PL(\theta) = \prod_{i \neq j} \prod_{m=1}^{r} P(X_{ijm} = 1 \mid X^c_{ijm})^{x_{ijm}} \, P(X_{ijm} = 0 \mid X^c_{ijm})^{1 - x_{ijm}} \qquad (6)
$$

and a maximum pseudo-likelihood estimator (MPLE) to be the value of $\theta$ that maximizes (6). MPLEs are much easier to calculate than maximum likelihood estimators (MLEs). MPLEs differ from MLEs for all but the simplest models (those for which the conditional probabilities are indeed independent of the complement relation). Basically, the approach assumes conditional independence of the random variables representing the multivariate relational ties (for discussion of the issues in using maximum pseudo-likelihood rather than maximum likelihood estimation, see Wasserman & Pattison, 1996, and Preisler, 1993).
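As a worked step, the pseudo-likelihood in equation (6) can be written on the log scale in terms of the logits of model (5). This is a short derivation sketch, using only $\omega_{ijm} = \theta' d(x_{ijm})$ and the fact that the two conditional probabilities sum to one:

```latex
\begin{aligned}
P(X_{ijm} = 1 \mid X^c_{ijm}) &= \frac{\exp(\omega_{ijm})}{1 + \exp(\omega_{ijm})},
\qquad
P(X_{ijm} = 0 \mid X^c_{ijm}) = \frac{1}{1 + \exp(\omega_{ijm})}, \\[4pt]
\log PL(\theta) &= \sum_{i \neq j} \sum_{m=1}^{r}
  \Bigl[ x_{ijm}\, \theta' d(x_{ijm}) - \log\bigl(1 + \exp\{\theta' d(x_{ijm})\}\bigr) \Bigr].
\end{aligned}
```

The second line has the same form as the log-likelihood of a logistic regression with response $x_{ijm}$ and explanatory variables $d(x_{ijm})$, and it involves no normalizing quantity.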

There is a large literature on the use of approximate likelihoods in spatial modelling. Diggle (1996) reviews models for discrete spatial variation, and notes that there are several possible estimation techniques. He notes in his detailed discussion that MPLEs are more efficient than other possibilities (which include the coding method of Besag, 1974). Further, for moderately large samples, the differences between MPLEs and MLEs are often negligible. Small sample sizes, and hence small networks ($g < 10$), unfortunately are particularly problematic.


In social network modelling, Strauss & Ikeda (1990) established that estimation of $\theta$ for single, dichotomous relations can be accomplished via logistic regression, using any standard logistic regression model-fitting routine. In particular, they showed that maximizing the pseudo-likelihood given in equation (6) is equivalent to maximizing the likelihood function for the fit of logistic regression to model (5) (for independent observations $x_{ijm}$). Further, they observed that such logistic regressions can be fitted using iteratively reweighted Gauss–Newton computational techniques, as implemented by any logistic regression model package.

The proof of this result uses the fact that the derivatives of the pseudo-likelihood, set equal to zero, are identical to those obtained from a logistic regression with the relational variables as data values. Thus, fitting p* can be done by using the logit p* form and assuming that the relational variables are actually statistically independent. The idea for this theorem was first suggested by Frank & Strauss (1986) for estimation of the parameters in their triad model. The generalization of this result to the three-way binary array $X$ is straightforward.
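The equivalence with logistic regression suggests a very small fitting routine. The sketch below is one possible implementation under stated assumptions, not the routine used for the analyses reported here: it uses plain Newton–Raphson updates (in place of a packaged iteratively reweighted routine), fits no separate intercept because the choice statistic already plays that role, and the helper names and toy network are hypothetical.

```python
import math
from typing import Callable, Dict, List, Sequence, Tuple

Tie = Tuple[int, int, int]  # (sender i, receiver j, relation m)


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a v = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda rr: abs(m[rr][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for rr in range(col + 1, n):
            factor = m[rr][col] / m[col][col]
            for cc in range(col, n + 1):
                m[rr][cc] -= factor * m[col][cc]
    v = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(m[row][cc] * v[cc] for cc in range(row + 1, n))
        v[row] = (m[row][n] - tail) / m[row][row]
    return v


def fit_logistic(d: List[Sequence[float]], y: List[int], steps: int = 25) -> List[float]:
    """Newton-Raphson for logistic regression with no separate intercept."""
    p = len(d[0])
    theta = [0.0] * p
    for _ in range(steps):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for row, obs in zip(d, y):
            eta = sum(t * v for t, v in zip(theta, row))
            mu = 1.0 / (1.0 + math.exp(-eta))
            for a in range(p):
                grad[a] += (obs - mu) * row[a]
                for b in range(p):
                    hess[a][b] += mu * (1.0 - mu) * row[a] * row[b]
        theta = [t + s for t, s in zip(theta, solve(hess, grad))]
    return theta


def mple(x: Dict[Tie, int], z: Callable[[Dict[Tie, int]], Sequence[float]]) -> List[float]:
    """Build d(x_ijm) = z(x+) - z(x-) for every tie, then fit the logit model."""
    d, y = [], []
    for tie, value in x.items():
        plus, minus = dict(x), dict(x)
        plus[tie], minus[tie] = 1, 0
        d.append([a - b for a, b in zip(z(plus), z(minus))])
        y.append(value)
    return fit_logistic(d, y)


# Hypothetical toy multigraph: g = 3 actors, r = 2 relations.
G, R = 3, 2
x = {(i, j, m): 0 for m in range(R) for i in range(G) for j in range(G) if i != j}
for tie in [(0, 1, 0), (1, 0, 0), (0, 1, 1), (1, 2, 1), (2, 0, 1)]:
    x[tie] = 1


def z_choice_by_relation(net: Dict[Tie, int]) -> List[int]:
    """One choice statistic per relation (a two-parameter independence model)."""
    return [sum(v for (_, _, m), v in net.items() if m == rel) for rel in range(R)]


theta_hat = mple(x, z_choice_by_relation)
```

For this independence model the estimates reduce to the log-odds of each relation's density (2 of 6 ties and 3 of 6 ties in the toy data), which gives a simple check; richer models only require a different statistics function `z`. The sketch makes no attempt to handle separation or a singular Hessian.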

The evaluation of the fit of multivariate p* is not straightforward, but it is helpful to compare the observed values $x_{ijm}$ with the fitted values $\hat{x}_{ijm}$. The fitted values, as is common with dichotomous variables, are defined as $\hat{x}_{ijm} = \hat{P}(X_{ijm} = 1 \mid X^c_{ijm})$. The estimated conditional probabilities are computed from

$$\operatorname{logit} \hat{P}(X_{ijm} = 1 \mid X^c_{ijm}) = \hat{\theta}' d(x_{ijm}).$$

Two useful indices of fit are the pseudo-likelihood ratio statistic

$$G^2_{PL} = 2 \sum x_{ijm} \log(x_{ijm} / \hat{x}_{ijm})$$

for a model, and the mean of the absolute value of the residuals $(x_{ijm} - \hat{x}_{ijm})$. In the examples below, we report both $G^2_{PL}$ and the mean absolute residual. Unfortunately, as with all other uses of this MPLE approach, the distribution of $G^2_{PL}$ is unknown, even asymptotically, and there is no straightforward way of estimating the standard errors of parameter estimates (although asymptotic standard errors calculated from logistic regression models can give approximate guidance to the modeller). Crouch & Wasserman (1998) give some preliminary results comparing MPLEs to MLEs, and report the optimistic finding that for moderately large networks ($g > 10$), both standard errors and test statistics based on the pseudo-likelihood approach are quite close to those based on the exact likelihood.
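Both fit indices are simple functions of the observed ties and the fitted conditional probabilities. The sketch below assumes that the sum in $G^2_{PL}$ runs over both outcome categories of each tie variable (presence and absence), which gives the usual binary deviance; this reading, like the function name, is an assumption rather than something stated explicitly above.

```python
import math
from typing import Sequence, Tuple


def fit_indices(x: Sequence[int], fitted: Sequence[float]) -> Tuple[float, float]:
    """Return (G2_PL, mean absolute residual) for binary x and fitted probabilities.

    Assumption: G2_PL is the binary deviance,
    -2 * sum[x log(xhat) + (1 - x) log(1 - xhat)].
    """
    g2 = 0.0
    for obs, hat in zip(x, fitted):
        g2 += -2.0 * math.log(hat if obs == 1 else 1.0 - hat)
    mean_abs_residual = sum(abs(obs - hat) for obs, hat in zip(x, fitted)) / len(x)
    return g2, mean_abs_residual


# Hypothetical toy case: four tie variables, one tie present, and a
# one-parameter model whose fitted probability is the density 0.25.
g2_toy, mar_toy = fit_indices([1, 0, 0, 0], [0.25, 0.25, 0.25, 0.25])
```

Under this reading, a one-parameter independence model with fitted density $p$ has mean absolute residual $2p(1 - p)$, so a residual near 0.5 indicates a fit no better than a coin toss and values near 0 indicate near-perfect reproduction of the observed ties.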

3.4. Computational details

Maximum pseudo-likelihood estimates of the parameters of model (1) are obtained by fitting the logistic regression model (5). In order to fit model (5), we compute, for each relational tie, the values of the ‘explanatory variables’ $z(x^+_{ijm}) - z(x^-_{ijm})$ corresponding to each statistic $z(x)$; we then use these as the observed explanatory variables for the realization of $X_{ijm}$ (the ‘response variable’) in the logistic regression corresponding to model (5).

The computation of the values $z(x^+_{ijm}) - z(x^-_{ijm})$ is simple, but it is useful to note that the values may take a different form for the various types of relational ties (corresponding to the subscript $m$ of $X_{ijm}$). For example, suppose that there are two relations, $X_l$ and $X_h$, and consider the parameter corresponding to the triadic effect $Z = (X_l X_h) \cap X_h$. If we assume homogeneity, then the sufficient statistic for this parameter is $f_Z$. For the two relations, the computed values of the explanatory variable for this triadic effect are equal to the changes in the statistic $f_Z$ when $x_{ijm}$ changes from 1 to 0, for $m = l$ or $h$. Thus, when $m = l$ (corresponding to the values for the first relation, $X_l$), we compute $\sum_k x_{ikh} x_{jkh}$ as the value of the explanatory variable corresponding to this parameter, and when $m = h$ (corresponding to an $X_h$ tie), we compute $\sum_k x_{ikl} x_{kjh} + \sum_k x_{kil} x_{kjh}$.
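The two closed-form expressions for the triadic effect $Z = (X_l X_h) \cap X_h$ can be checked against a direct recomputation of $f_Z$ with the tie present and absent. The following sketch uses hypothetical function names and toy matrices, and assumes zero diagonals (no self-ties), which is what makes the closed forms exact.

```python
from typing import List

Matrix = List[List[int]]


def f_triadic(xl: Matrix, xh: Matrix) -> int:
    """f_Z for Z = (X_l X_h) n X_h: sum over i, j, k of x_ikl * x_kjh * x_ijh."""
    g = len(xl)
    return sum(xl[i][k] * xh[k][j] * xh[i][j]
               for i in range(g) for j in range(g) for k in range(g))


def change_closed_form(xl: Matrix, xh: Matrix, i: int, j: int, relation: str) -> int:
    """Explanatory variable for tie (i, j) of type 'l' or 'h', as given in the text."""
    g = len(xl)
    if relation == "l":
        return sum(xh[i][k] * xh[j][k] for k in range(g))
    return (sum(xl[i][k] * xh[k][j] for k in range(g))
            + sum(xl[k][i] * xh[k][j] for k in range(g)))


def change_by_toggling(xl: Matrix, xh: Matrix, i: int, j: int, relation: str) -> int:
    """z(x+) - z(x-) obtained by recomputing f_Z with the tie set to 1 and to 0."""
    target = xl if relation == "l" else xh
    saved = target[i][j]
    target[i][j] = 1
    plus = f_triadic(xl, xh)
    target[i][j] = 0
    minus = f_triadic(xl, xh)
    target[i][j] = saved  # restore the observed value
    return plus - minus


# Hypothetical toy relations on g = 4 actors, zero diagonals.
XL = [[0, 1, 1, 0],
      [1, 0, 0, 1],
      [0, 1, 0, 0],
      [1, 0, 1, 0]]
XH = [[0, 1, 0, 1],
      [0, 0, 1, 1],
      [1, 1, 0, 0],
      [0, 1, 1, 0]]

all_agree = all(
    change_closed_form(XL, XH, i, j, rel) == change_by_toggling(XL, XH, i, j, rel)
    for rel in ("l", "h") for i in range(4) for j in range(4) if i != j
)
```

Toggling is slower but works for any statistic, so it is a convenient way to validate a closed-form change statistic before using it to build the explanatory variables for a large network.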

4. Examples

We illustrate the construction and fitting of multivariate p* models using two examples.

4.1. The Grade 7 peer network

The first example is an extension of the data analysed by Wasserman & Pattison (1996). Vickers (1981) and Vickers & Chan (1981) obtained network data from 29 students in grade 7 in a school in Victoria, Australia. They asked students to nominate their classmates on a number of relations, including the following:

1) Who are your best friends in the class?
2) Who would you rather not have as a friend?

We label the relations defined by these two questions as $X_B$ (relation 1) and $X_N$ (relation 2), and their associated matrices as $B$ and $N$, respectively. The matrix for the ‘best friends’ relation is given here as our Table 2, and the matrix for the ‘not friends’ relation as our Table 3. As noted by Wasserman & Pattison (1996), actors 1–12 are boys, while actors 13–29 are girls.

In Wasserman & Pattison (1996), we analysed the relation $X_B$ and established that it possessed strong reciprocity and transitivity effects. Here, we fit models simultaneously to the relations $X_B$ and $X_N$, in an attempt to model their mutual interdependence. Our models use the methodology described earlier, and are guided by the literature that has speculated on the structure of positive and negative affect ties (see the discussion in Wasserman & Faust, 1994, Chapter 6, on signed graphs); we also compare our models to previous descriptive analyses of similar types of ties. We report the fit of a number of homogeneous models.

Models 1a and 1b – independence. We first fit two versions of a complete independence model, in which we make the (implausible) assumption that all observed ties are independent. In the first version of the model, we allow a single, separate ‘choice’ parameter $\theta_Z$ (where $Z$ may be either $B$ or $N$) for each type of relation; in the second, more restricted version, we assume a single common choice parameter. In both versions of the model, the maximal cliques of the dependence graph have the form $\{(i,j,m)\}$; in model 1a, the parameters corresponding to this clique are assumed to depend on relation $m$ (but not on actor $i$ or $j$), whereas in model 1b, the parameter is assumed constant for all $i$, $j$ and $m$. The sufficient statistics for model 1a are $f_B$ and $f_N$; model 1b has sufficient statistic $f_{B+N}$. The fit of models 1a and 1b is summarized in Table 4. Neither model provides a good fit, with the mean of the absolute residuals equal to approximately 0.37. Since model 1b is nested in model 1a, the difference between the pseudo-likelihood ratio statistics is of interest, and we note that model 1b appears to be no worse a fit than model 1a ($\Delta G^2_{PL} = 3.2$, and the models differ by one parameter).


Model 2 – multiplexity. Model 2 is a multiplexity model with maximal cliques $\{(i,j,1),(i,j,2)\}$. The model allows for the possibility that an $X_B$ tie from $i$ to $j$ is conditionally dependent on an $X_N$ tie from $i$ to $j$. The parameters of the model have the form $\theta_Z$, where $Z$ may be $B$, $N$ or $B \cap N$; the corresponding sufficient statistics are $f_B$, $f_N$ and $f_{B \cap N}$, respectively. Thus, this model adds a single multiplex parameter $\theta_{B \cap N}$ to the two choice parameters in model 1a. Model 2 appears to be a substantial improvement over model 1a ($\Delta G^2_{PL} = 253.7$ with one additional parameter), but the small frequency of $B \cap N$ ties implies that the MPLE of its corresponding parameter is likely to have a large standard error.

Models 3a and 3b – reciprocity and exchange. Model 3 assumes bivariate dyad independence (as described by Wasserman, 1987) and has maximal cliques $\{(i,j,1),(i,j,2),(j,i,1),(j,i,2)\}$. We fit two restricted versions of the model: first, model 3a, in which only choice and reciprocity effects are assumed (with parameters $\theta_Z$ for $Z = B$, $N$, $B \cap B'$ and $N \cap N'$), and second, model 3b, with an additional exchange parameter $\theta_Z$ for the relation $Z = B \cap N'$. In model 3a, the presence of an $X_B$ tie from $i$ to $j$ is assumed to be conditionally dependent on the presence of an $X_B$ tie from $j$ to $i$ (that is, on the presence of reciprocity); similarly for $X_N$ ties. Model 3b allows, in addition, the presence of an $X_B$ tie from $i$ to $j$ to be conditionally dependent on the presence of an $X_N$ tie from $j$ to $i$ (that is, on the exchange of an $X_N$ tie for an $X_B$ one). We have not fitted the most general homogeneous dyad-independence model, which includes multiplexity parameters, since $B$ and $N$ co-occur only rarely (and, as a result, it is difficult to fit parameters corresponding to relations such as $B \cap N$, $B \cap N \cap B'$, and so forth). The fit statistics in Table 4 indicate that not only is model 3a a substantial improvement over model 1a ($\Delta G^2_{PL} = 208.6$ with just two additional parameters), but also that model 3b provides a marginally better fit than model 3a ($\Delta G^2_{PL} = 27.1$ with one additional parameter). These figures suggest the presence of both reciprocity and exchange effects. Note, though, that the fit of model 3b is still not particularly good, with the mean of the absolute residuals equal to 0.311.

Table 2. Vickers & Chan’s (1981) network data: ‘best friends’ relation

0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 1 0 1 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 1 0 1 0 1 1 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 0 1 0
1 1 1 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 0 0 1 1 1
1 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 0 0 1 1 1 1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 1 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 1 0 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 1 0 0 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0

Table 3. Vickers & Chan’s (1981) network data: ‘not friends’ relation

0 0 0 0 0 0 1 0 1 1 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 0 0 1 1
0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 1 1 0 0 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0
1 0 1 1 0 0 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 1
0 0 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
0 0 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0
1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 1
1 0 0 1 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0

Table 4. Summary of fit of models 1a–5b to the grade 7 peer network

Model   No. of parameters   $G^2_{PL}$   Mean absolute residual
1a       2                  1794.1       0.366
1b       1                  1797.3       0.367
2        3                  1540.4       0.329
3a       4                  1585.5       0.315
3b       5                  1558.4       0.311
4       13                  1511.0       0.300
5a      19                  1220.6       0.241
5b      23                  1032.3       0.196

Model 4 – path dependence. Model 4 is a path-dependent model, and assumes that a tie of any type from $i$ to $j$ may be conditionally dependent on ties of any type from $j$ to some third individual $k$. Maximal cliques therefore have the form $\{(i,j,m),(j,i,h)\}$ or $\{(i,j,m),(j,k,h),(k,i,p)\}$; parameters and sufficient statistics are given by $\theta_Z$ and $f_Z$, respectively, where $Z$ may be any of the relations $B$, $N$, $B \cap B'$, $N \cap N'$, $B \cap N'$, $BB$, $BN$, $NB$, $NN$, $BB \cap B'$, $BN \cap B'$, $BN \cap N'$ and $NN \cap N'$. Compared to model 3b, model 4 adds only marginally to the fit ($\Delta G^2_{PL} = 47.4$ with eight additional parameters).

Models 5a and 5b – restricted Markov random graph models. The final set of models are path-dependent models with additional dependencies assumed on substantive grounds. All models have the model 4 parameters; in addition, model 5a possesses dependencies consistent with the transitivity-like hypothesis that friends are likely to agree on their relations with third parties (hence, likely pairwise conditional dependencies between relational ties of type $X_B$ from $i$ to $j$, $i$ to $k$ and $j$ to $k$, and also between relational ties of type $X_B$ from $i$ to $j$, of type $X_N$ from $i$ to $k$, and of type $X_N$ from $j$ to $k$). Model 5b possesses additional dependencies consistent with the claim that non-friends are likely to disagree on their relations with third parties (hence, likely pairwise conditional dependencies between relational ties of type $X_N$ from $i$ to $j$, of type $X_N$ from $i$ to $k$, and of type $X_B$ from $j$ to $k$, and also between relational ties of type $X_N$ from $i$ to $j$, of type $X_B$ from $i$ to $k$, and of type $X_N$ from $j$ to $k$). (See Johnsen (1986) for a review and analysis of the literature on the structure of affective ties, and Pattison (1993) for an algebraic translation of these structural claims.) Model 5a adds $\{(i,j,1),(j,k,1),(i,k,1)\}$ and $\{(i,j,1),(j,k,2),(i,k,2)\}$ to the set of maximal cliques for model 4; model 5b also adds $\{(i,j,2),(j,k,1),(i,k,2)\}$ and $\{(i,j,2),(j,k,2),(i,k,1)\}$. We note that all of the subcliques of these additional maximal cliques have corresponding parameters in models 5a and 5b; these additional subcliques correspond to various forms of stars: $\{(i,j,m),(i,k,h)\}$, $\{(i,j,m),(k,j,h)\}$ and $\{(i,j,m),(j,k,h)\}$. As indicated in Table 4, the additional dependencies assumed by model 5a lead to a substantial improvement over the simple path-dependent model 4 ($\Delta G^2_{PL} = 290.4$ with six additional parameters), and those associated with model 5b lead to a modest further improvement in fit ($\Delta G^2_{PL} = 188.3$ with four additional parameters). The mean of the absolute residuals for model 5b is 0.196, suggesting a more reasonable fit to the data (but one that could lend itself to further possible improvement).
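The nested-model comparisons quoted above are simple differences of the Table 4 entries. The sketch below reproduces that arithmetic; the dictionary and function names are hypothetical, and the values are the Table 4 figures with decimal points placed on the assumption that the differences must match those quoted in the text.

```python
# G2_PL and number of parameters for each model, as read from Table 4.
G2_PL = {"1a": 1794.1, "1b": 1797.3, "2": 1540.4, "3a": 1585.5,
         "3b": 1558.4, "4": 1511.0, "5a": 1220.6, "5b": 1032.3}
N_PARAMS = {"1a": 2, "1b": 1, "2": 3, "3a": 4,
            "3b": 5, "4": 13, "5a": 19, "5b": 23}


def compare(restricted: str, general: str):
    """Return (drop in G2_PL, number of extra parameters) for nested models."""
    delta = round(G2_PL[restricted] - G2_PL[general], 1)
    extra = N_PARAMS[general] - N_PARAMS[restricted]
    return delta, extra


comparisons = {
    "1b vs 1a": compare("1b", "1a"),
    "1a vs 2": compare("1a", "2"),
    "1a vs 3a": compare("1a", "3a"),
    "3a vs 3b": compare("3a", "3b"),
    "3b vs 4": compare("3b", "4"),
    "4 vs 5a": compare("4", "5a"),
    "5a vs 5b": compare("5a", "5b"),
}
```

Because the distribution of $G^2_{PL}$ is unknown, these differences are best read as descriptive measures of improvement per added parameter rather than as formal test statistics.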

The MPLEs for the parameters of model 5b are displayed in Table 5. Positive estimates were observed for both reciprocity parameters and for the parameters associated with three of the four additional hypothesized dependencies. Thus, the conditional odds of a tie of any type appear to be enhanced if a reciprocal tie of the same type is present, if the tie completes one of the expected triadic structures for agreement between friends, or if the tie completes a triad in which an individual would rather not have as a friend any friend of someone who has been indicated as a non-friend. Negative estimates were obtained for the exchange parameter, for 2-stars comprising two incoming $X_B$ ties, and for 3-cycles comprising $X_B$ ties. Thus, the conditional odds of a tie of any type appear to be reduced by the presence of a reciprocated tie of the other type; in addition, the odds of an $X_B$ tie being directed to a particular individual are reduced if other $X_B$ ties are also directed to the same individual, or if the tie completes a 3-cycle of $X_B$ ties.

Table 5. Parameter estimates for model 5b fitted to the grade 7 peer network

Model parameter                        Z              $\hat{\theta}_Z$   Approximate standard error
1-paths (choice)                       $B$            -1.81              0.76
                                       $N$            -2.39              0.65
2-cycles (reciprocity & exchange)      $B \cap B'$     2.53              0.37
                                       $N \cap N'$     0.61              0.26
                                       $B \cap N'$    -0.67              0.28
2-paths                                $BB$            0.01              0.05
                                       $BN$           -0.03              0.04
                                       $NB$           -0.11              0.04
                                       $NN$            0.02              0.04
3-cycles                               $BB \cap B'$   -0.72              0.14
                                       $BN \cap B'$    0.05              0.08
                                       $BN \cap N'$    0.03              0.07
                                       $NN \cap N'$   -0.05              0.09
2-stars                                $BB'$          -0.36              0.08
                                       $BN'$          -0.08              0.04
                                       $NN'$           0.06              0.04
                                       $B'B$          -0.01              0.04
                                       $B'N$          -0.04              0.03
                                       $N'N$           0.07              0.02
Additional hypothesized constraints    $BB \cap B$     0.57              0.06
                                       $BN \cap N$     0.17              0.05
                                       $NB \cap N$     0.33              0.05
                                       $NN \cap B$    -0.09              0.06

4.2. Padgett & Ansell’s Florentine network

Our second example is an analysis of marriage and business ties among groups of Florentine families (Padgett & Ansell, 1993). In an analysis of the rise to power of the Medici family in Florence in the early fifteenth century, Padgett & Ansell constructed a number of network relations among 33 groups of elite families, including marriage and business or economic ties. The construction was based on a coding of various types of network relations among a 92-family ruling elite from Kent’s (1978) description of the network foundations of the Medici party and their opponents. Padgett & Ansell used marriage and economic networks to derive a clustering of the 92 families into 33 family groups (using the CONCOR algorithm; see Breiger, Boorman & Arabie, 1975); they then coded a relation of a particular type between two family groups if there were at least two pairs of families, with one family from each group, linked by a relation of that type. The analysis presented below is for marriage and economic relations among these 33 family groups, shown in Figure 2a of Padgett & Ansell (1993); for the purpose of the analysis reported below, within-group relationships are ignored and the various types of economic ties are aggregated into a single business/economic relation. Thus, a marriage tie is coded from one group to another if a woman of the first group is married to a man in the second; a business/economic tie signifies the presence of trading or partnership relationships, the sharing or renting of real estate, or a bank employment relation (see Padgett & Ansell, 1993, pp. 1265–1266).

Padgett & Ansell used the interconnections among social and demographic factors, these relational ties, and actions on the part of Cosimo de’ Medici to explain the source of the latter’s extraordinary power; here, we examine the joint network structure of the marriage and business/economic ties.

We label the relations studied by Padgett & Ansell as $X_B$ (business ties) and $X_M$ (marriage ties). Their associated matrices are $B$ and $M$, respectively.

In Table 6, we report the fit of six classes of models, similar in construction to those reported for the grade 7 peer network. As for the grade 7 peer network, models 1a and 1b are two- and one-parameter complete independence models, respectively, and model 2 is a multiplexity model. It is clear from Table 6 that there is little improvement in fit of the two-parameter choice complete independence model (model 1a) over the one-parameter choice model (model 1b) ($\Delta G^2_{PL} = 0.7$ with one extra parameter); in addition, permitting dependencies among marriage and business ties for the same individuals does little to improve model fit ($\Delta G^2_{PL} = 0.4$ for model 2 compared to model 1a). Models 3a and 3b are reciprocity and exchange models. Model 3a adds to model 1a the reciprocity effects for $X_B$ and $X_M$ ties; model 3b further adds the exchange effect that allows conditional dependence of a marriage tie from $i$ to $j$ and a business tie from $j$ to $i$. The reciprocity effects in model 3a lead to a substantial improvement in fit over model 1a ($\Delta G^2_{PL} = 164.0$ with two additional parameters), but no further improvement is achieved by permitting the dyadic exchange of marriage and business ties ($\Delta G^2_{PL} = 0.2$). Model 4 is a path-dependent model and is a marginal improvement in fit over model 3b ($\Delta G^2_{PL} = 45.1$ with six additional parameters). Parameters corresponding to cycles with two or more business ties were excluded from the model because of the infrequency of occurrence of such structures.

Since as Padgett amp Ansell (1993) note the gaining of hierarchical status was the primaryconsideration in the arrangement of marriage ties between elite families we might expectmarriage ties to exhibit a tendency towards transitivity Hence model 5a assumes in addition

Philippa Pattison and Stanley Wasserman186

Table 6 Summary of t models 1andash6d to the Florentine network

Model No of parameters G2PL Mean absolute residual

1a 2 4872 00481b 1 4879 00482 3 4868 00483a 4 3232 00323b 5 3230 00324 11 2779 00295a 18 2437 00265b 17 2463 00266a 21 2279 00266b 23 2267 00266c 23 2252 00266d 23 2170 0025

to conditional dependencies for paths of length 2 pairwise conditional dependenciesamong marriage ties from i to j j to k and i to k (and hence adds a parameter correspondingto the relation X = MM Ccedil M) Further all possible stars comprising two relations areadded as well in order to investigate possible interdependencies between marriage andbusiness ties that are not evident at the level of ties from an actor i to an actor j (see thecomparison between the complete independence model 1a and the multiplex model 2) Thesedependencies also require various star parameters hz for Z equal to MM 9 M 9 M M 9 B andBB 9

The fit of model 5a was a modest improvement over that of model 4 ($\Delta G^2_{PL} = 34.2$ with six additional parameters). The estimated parameter corresponding to the relation $MM \cap M$ is not large, so in model 5b the parameter is removed, with little effect on the fit of the model ($\Delta G^2_{PL} = 2.6$).

A final set of models fitted to the data investigated the possibility of structural differences in ties according to party affiliation. As Padgett & Ansell (1993) observed, the first 10 family groups are substantially identified with the Medici party (the Medici family themselves comprising group 1), whereas the remaining groups of families are not. Padgett & Ansell described the remarkable structural differences between the network of relations within the Medici party and within the remaining (largely oligarchic) set. Models 6a–6d therefore allow various model 5b parameters to differ according to whether they refer to ties lying either within the collection of Medici blocks, to ties connecting non-Medici blocks, or to ties crossing the boundary between the two collections of blocks. Model 6a allows such variation for the density parameter and is a substantial improvement over model 5b ($\Delta G^2_{PL} = 18.4$ with four additional parameters). Model 6b permits the parameters for 'mixed' out-stars comprising marriage and business ties to differ for the three types of blocks and is not a substantial improvement over model 6a ($\Delta G^2_{PL} = 1.4$). Model 6c allows heterogeneity across blocks in the parameters for 2-paths comprising marriage and business ties; it also fails to improve fit compared to model 6a ($\Delta G^2_{PL} = 2.5$). The final model, 6d, permits heterogeneity across blocks in the parameters for paths comprising two marriage ties; in this case, there is a modest improvement in fit compared to model 6a ($\Delta G^2_{PL} = 10.8$ with two additional parameters).

The estimated parameters for model 6d are shown in Table 7. The estimates suggest a strong tendency for reciprocated business ties, a tendency that is unsurprising given the form of business or economic ties such as partnerships. There are weaker tendencies for the existence of 2-paths comprising either marriage or business ties; marriage ties also appear to be more likely if they complete a cycle of three marriage ties. Padgett & Ansell (1993) noted the presence of these cycles and analysed both their development and their consequences; they make a compelling argument for their importance to the evolving structure of the oligarchy. It can also be seen from Table 7 that path structures in which an outgoing marriage tie is accompanied by an incoming business tie reduce the likelihood of the overall structure. Estimates of star parameters suggest the prevalence of heterogeneous stars in which a group of families have marriage ties with one group and business ties with another. The parameter estimates for homogeneous marriage in-stars and out-stars are both negative: there appears to have been a reduced conditional probability of a marriage tie to a family group if some other group also had such a tie and, to a lesser extent, if the first family group had another outgoing marriage tie.

The parameters for block-dependent densities suggest an enhanced likelihood of marriage ties within the Medici collection of family groups and, to a lesser extent, within the non-Medici collection; marriage ties between the two types of family groups were less likely. Business ties exhibit a substantially weaker pattern of the same form. Together, these characteristics of the network reflect what Padgett & Ansell noted was a remarkable interdependence of marriage and economic ties, on the one hand, and political partisanship, on the other, and they support their conclusion that the microstructure of marriage and economics was central to the formation of parties in Florence (1993, p. 1277). The block-dependence of marriage 2-paths takes a different and interesting form: such paths are less likely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection, and they are even more likely to link family groups of different types. The group containing members of the Medici family is the major contributor to this pattern, as they are the only Medici group with marriage connections outside the collection mobilized into the Medici party. Note that this structural effect is fitted at the same time as the cyclic pattern for marriage ties, so that although, as Padgett & Ansell noted, there are many more two-step marriage connections for non-Medici than for Medici partisans, many of the former connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).

Table 7. Parameter estimates for model 6d fitted to the Florentine network

Model parameter                        Z                $\theta_Z$   Approximate standard error
1-paths (choice)                       $M$              -5.17        1.02
                                       $B$              -7.37        1.25
2-cycles (reciprocity and exchange)    $M \cap M'$       0.95        0.94
                                       $B \cap B'$      10.33        1.72
                                       $M \cap B'$       0.65        1.08
2-paths                                $MM$              0.66        0.32
                                       $MB$              0.16        0.38
                                       $BM$             -0.84        0.37
                                       $BB$              1.26        0.95
3-cycles                               $MM \cap M'$      2.12        0.61
                                       $MB \cap M'$     -0.35        0.85
2-stars                                $MM'$            -1.55        0.37
                                       $M'M$            -0.43        0.20
                                       $BB'$            -1.53        1.08
                                       $B'B$            -0.85        0.99
                                       $MB'$            -0.14        0.36
                                       $M'B$             0.92        0.35
Subgroup-dependent M effects
  1-paths within Medici                                  3.71        1.12
  1-paths between subgroups                             -4.67        1.92
  1-paths within other subgroups                         0.96
Subgroup-dependent B effects
  1-paths within Medici                                  0.70        1.06
  1-paths between subgroups                             -0.80        0.87
  1-paths within other subgroups                         0.10
Subgroup-dependent MM effects
  2-paths within Medici                                 -1.33        0.46
  2-paths between subgroups                              1.08        0.44
  2-paths within other subgroups                         0.25

Thus model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell's detailed account.

5. Conclusion

The multivariate p* model is very general in form and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996) and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number r of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research – much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form z(x) of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different 'parts' of the network (and in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that in the literature on algebraic models for multivariate networks there is a second tradition that can usefully guide such statistical developments. Local structural descriptions based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995) describe heterogeneity across individuals. Thus a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown in general how Markov chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754) and the National Institutes of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance, and Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.

Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.

Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.

Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Association, 47, 77–92.

Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.

Boorman, S. A., & White, H. C. (1976). Social structure from multiple networks: II. Role structures. American Journal of Sociology, 81, 1384–1446.

Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.

Breiger, R. L., Boorman, S. A., & Arabie, P. (1975). An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.

Coleman, J. S., Katz, E., & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N., & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R., & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B., & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure II: Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E., & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M., & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.

Fienberg, S. E., Meyer, M. M., & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds), Operations research proceedings 1986, pp. 455–460. Berlin, Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathématiques, Informatique et Sciences Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B., & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O., & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds), Quo Vadis, Graph Theory? Annals of Discrete Mathematics, 55, 349–366.

Frank, O., & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J., & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J., & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W., & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W., & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J., & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D., & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.


Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L., & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E., & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A., & Pattison, P. (1998). Multivariate p* models of producers' market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P., & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F., & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P., & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G., & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). p1/2. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D., & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M., & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.


Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S., & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S., & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 60, 401–425.

Wasserman, S., & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S., & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A., & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C., & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999


$(\varphi(i), \varphi(j), m) \in B$, for $i, j \in N$, $m \in \mathcal{R}$. If two configurations A and B are isomorphic, we set $\lambda_A = \lambda_B$ and note that the sufficient statistic corresponding to $\lambda_A$ becomes

$$\sum_{B} \prod_{(i,j,m) \in B} x_{ijm},$$

where the summation is over all configurations B isomorphic to A.

A more restricted form of parameter equating may also be useful when the nodes of the random graph are hypothesized to fall into distinct classes or positions (as in an a priori blockmodel; see, for example, White et al., 1976; Wasserman & Anderson, 1987; Wasserman & Faust, 1994, Chapter 10). In this case, the random graph nodes of the configuration identified with the subset A may be regarded as coloured, and two configurations A and B are defined as isomorphic if there is a one-to-one mapping $\varphi$ on the nodes of N such that:

1. $(i, j, m) \in A$ if and only if $(\varphi(i), \varphi(j), m) \in B$;
2. $i$ and $\varphi(i)$ have the same colour;
3. $j$ and $\varphi(j)$ have the same colour.

We then set $\lambda_A = \lambda_B$ only if A and B are isomorphic (using this more restrictive definition).

3.2. The multivariate p* model

As mentioned, we refer to equation (1) as the multivariate p* model. The parameters of the model correspond to the cliques of the dependence graph D. The sufficient statistic corresponding to the parameter $\lambda_A$ for clique A of D has the form $\prod_{(i,j,m) \in A} x_{ijm}$; in the case where homogeneity effects have been imposed, the sufficient statistics are counts of such values over cliques whose parameters are set to be equal.

3.2.1. Introduction

The dependence structures for social networks described in the preceding section give rise to network statistics identified in Table 1. In order more easily to define the statistics, we introduce a counting function f for an array Z as the sum of entries in the array: $f_Z = \sum_{i,j} Z_{ij}$. The function f is a count of the number of distinct ordered pairs of nodes i and j for which there is a relational tie of type Z. For convenience, we refer to the parameter corresponding to $f_Z$ as $\theta_Z$.

Table 1. Some statistics and parameters for univariate and multivariate relations

Effect                       Parameter          Graph statistic in 'count form'
Single dichotomous relations
  Choice                     $\theta$           $f_X$
  Mutuality                  $\rho$             $f_{X \cap X'}$
  Transitivity               $\tau$             $f_{XX \cap X}$
  Expansiveness              $\alpha_i$         $f_{X \cap R_i}$
  Attractiveness             $\beta_j$          $f_{X \cap C_j}$
  m-paths                    $\pi_m$            $f_{X^m}$
  Subgroup density           $\phi^{[st]}$      $f_{X \cap d_{st}}$
  Subgroup mutuality         $\rho^{[st]}$      $f_{X \cap X' \cap d_{st}}$
  Subgroup transitivity      $\tau^{[sut]}$     $f_{(X \cap d_{su})(X \cap d_{ut}) \cap (X \cap d_{st})}$
Multivariate relations
  Association                $C$                $f_{X \cap Y}$
  Multiplexity               $\theta_{kl}$      $f_{X_k \cap X_l}$
  Exchange                   $\rho_{kl}$        $f_{X_k \cap X'_l}$
  Generalized transitivity   $\tau_{klm}$       $f_{(X_k X_l) \cap X_m}$
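To make the count-form statistics concrete, the sketch below computes a few of them in plain Python for a small, hypothetical pair of relations. It assumes that the intersection of two binary arrays is their elementwise product, that the primed relation is the transpose, and that composition is the ordinary matrix product; the arrays `M` and `B` and all function names are invented for illustration and are not the Florentine data.

```python
def transpose(x):
    """Converse relation X': reverse the direction of every tie."""
    return [list(row) for row in zip(*x)]


def intersect(x, y):
    """Elementwise product: a tie is kept only if present in both arrays."""
    return [[a * b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def compose(x, y):
    """Ordinary matrix product; entry (i, j) counts 2-paths i -> k -> j."""
    g = len(x)
    return [[sum(x[i][k] * y[k][j] for k in range(g)) for j in range(g)]
            for i in range(g)]


def f(z):
    """Counting function: the sum of all entries of the array z."""
    return sum(sum(row) for row in z)


# Hypothetical 3-actor example (made-up marriage and business arrays).
M = [[0, 1, 0],
     [1, 0, 1],
     [0, 0, 0]]
B = [[0, 1, 0],
     [0, 0, 0],
     [0, 1, 0]]

stats = {
    "choice": f(M),                                    # f_M
    "mutuality": f(intersect(M, transpose(M))),        # f_{M n M'}
    "transitivity": f(intersect(compose(M, M), M)),    # f_{MM n M}
    "multiplexity": f(intersect(M, B)),                # f_{M n B}
    "exchange": f(intersect(M, transpose(B))),         # f_{M n B'}
}
print(stats)
```

Because f sums over ordered pairs, a single mutual dyad contributes 2 to the mutuality count in this sketch.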

When homogeneity constraints are imposed, we can represent the sufficient statistics in a compact form. For the assumption of multiplex conditional dependencies, any clique in $N_D$ has the form

$$A = \{(i, j, m_1), (i, j, m_2), \ldots, (i, j, m_q)\};$$

thus, in the homogeneous case, the sufficient statistic for the multiplex parameter associated with the clique A is $f_Z$, where $Z = X_{m_1} \cap X_{m_2} \cap \cdots \cap X_{m_q}$. (Note that any non-empty subset of relations gives rise to a clique of this form, so that we also have statistics of the form $f_{X_m}$, $f_{X_k \cap X_l}$, and so forth.)

Reciprocity cliques of the form $\{(i, j, m), (j, i, l)\}$ give rise to the exchange statistics $f_{X_k \cap X'_l}$. Cliques in role-interlocking dependence structures lead to additional 2-path and 3-cycle statistics of the form $f_{X_m X_h}$ and $f_{(X_m X_h) \cap X'_n}$, respectively.

Some of the statistics for parameters reflecting row and column effects can be defined using the indicator matrices $R_i$ and $C_j$, whose elements are given by

$$(R_i)_{kl} = \begin{cases} 1 & \text{if } k = i \\ 0 & \text{otherwise} \end{cases} \qquad (C_j)_{kl} = \begin{cases} 1 & \text{if } l = j \\ 0 & \text{otherwise.} \end{cases}$$

In order to define statistics for the Markov random multigraph model, let $\mathcal{R}_k$ be any subset of relations and define $Y_k$ as the intersection of the relations in $\mathcal{R}_k$. The triad statistic corresponding to a general multivariate triad has the general form $f_Z$, with $Z = (Y_1 \cap Y'_4)(Y_2 \cap Y'_5) \cap (Y_3 \cap Y'_6)$, for some $Y_1, Y_2, \ldots, Y_6$.

When homogeneity is imposed only within S possible blocks or positions, the network statistics that arise correspond to within-block sums and can be represented by using the matrix $d_{st}$ with entries

$$(d_{st})_{ij} = \begin{cases} 1 & \text{if } i \in \text{block } s \text{ and } j \in \text{block } t \\ 0 & \text{otherwise.} \end{cases}$$

For example, in the case of any homogeneous statistic $f_Z$, the block-homogeneous set of statistics is $\{f_{Z \cap d_{st}} : s = 1, 2, \ldots, S;\ t = 1, 2, \ldots, S\}$.

Some other network statistics and associated parameters are also presented in Table 1. This table also identifies the parameter labels used in Wasserman & Pattison (1996) and their generalizations to multivariate networks.

Note that each of the statistics described above may be assumed to be homogeneous, or may be allowed to depend on some mutually exclusive and exhaustive partition of actors or pairs of actors. For example, generalized transitivity statistics may be calculated for every triple of subgroups arising from a partition (for example, $f_{(X_k \cap d_{su})(X_l \cap d_{ut}) \cap (X_m \cap d_{st})}$) and may be used to assess the homogeneity of generalized transitivity across subgroups.
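The block-homogeneous statistics can be illustrated with a short sketch. It assumes a hypothetical partition of four actors into two blocks and a made-up binary relation `X`; the helper names are illustrative. Each block-level density count is the sum of the entries of `X` that fall in the block pair (s, t), obtained by intersecting `X` with the block indicator matrix.

```python
def block_indicator(blocks, s, t):
    """d_st: entry (i, j) is 1 when actor i is in block s and actor j is in block t."""
    return [[1 if bi == s and bj == t else 0 for bj in blocks] for bi in blocks]


def intersect(x, y):
    """Elementwise product of two arrays of the same shape."""
    return [[a * b for a, b in zip(rx, ry)] for rx, ry in zip(x, y)]


def f(z):
    """Counting function: the sum of all entries of the array z."""
    return sum(sum(row) for row in z)


# Hypothetical example: four actors, the first two in block 0, the others in block 1.
X = [[0, 1, 1, 0],
     [1, 0, 0, 0],
     [0, 0, 0, 1],
     [0, 1, 1, 0]]
blocks = [0, 0, 1, 1]

density_counts = {
    (s, t): f(intersect(X, block_indicator(blocks, s, t)))
    for s in (0, 1) for t in (0, 1)
}
print(density_counts)
```

Since the blocks form a partition, the block-level counts add up to the homogeneous statistic for the whole array.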


3.2.2. The model

In combination with various homogeneity constraints, model (1) can be written in the general form

$$P(X = x) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)}, \tag{2}$$

where $\theta$ is a vector of model parameters and z(x) is a vector of network statistics. As we have described, these vectors depend on the structure of the hypothesized dependence graph and on whether any homogeneity constraints have been proposed.

The model is of exponential family form; that is, the probability function depends on an exponential function of a linear combination of network statistics. In some cases, constraints on the elements of $\theta$ are required in order to ensure a set of uniquely determined parameters (as we illustrate later with our examples). Usually the elements of $\theta$ are unknown and must be estimated.

The function $\kappa(\theta)$ in the denominator of model (2) is a normalizing quantity whose value guarantees that the probability distribution is indeed proper, summing to unity over the sample space of the random variable X (the set of all possible multivariate networks with r relations and g actors).

Estimation of the parameters of models that assume only multiplexity and/or generalized reciprocity and exchange effects (as in the multivariate $p_1$ model) is not particularly difficult. In these cases, the likelihood function is simply the product of the probabilities for each multivariate tie or dyad (for example, see Wasserman, 1987). Estimation of parameters of the general multivariate p* model is not straightforward, however. The likelihood function for the parameters $\theta$ of p* depends on the complicated normalizing quantity $\kappa(\theta)$, which makes maximum likelihood estimation difficult except in special circumstances (such as dyadic independence) and when the multigraphs are quite small (Walker, 1995). In order for probabilities to be computed, one must be able to calculate $\kappa$, which is just too difficult for most networks. Hence, alternative model formulations and approximate estimation techniques are important. One such alternative, which we now describe, utilizes log-odds ratios of the conditional probabilities of each element of X.
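The difficulty with the normalizing quantity is one of scale: with g actors and r relations there are $2^{rg(g-1)}$ possible multigraphs to sum over. The sketch below evaluates the sum by brute force for a deliberately tiny case (g = 3, r = 2). It assumes a hypothetical model containing only two choice statistics and one multiplexity statistic, under which ordered pairs are independent, so that a closed form is available as a check; the parameter values and function names are illustrative only.

```python
from itertools import product
from math import exp

R = 2  # number of relations in this sketch


def cells(g):
    """All tie variables (i, j, m) with i != j."""
    return [(i, j, m) for i in range(g) for j in range(g) if i != j for m in range(R)]


def z(x, g):
    """Hypothetical statistic vector (f_X1, f_X2, f_{X1 n X2})."""
    pairs = [(i, j) for i in range(g) for j in range(g) if i != j]
    return (
        sum(x[(i, j, 0)] for i, j in pairs),
        sum(x[(i, j, 1)] for i, j in pairs),
        sum(x[(i, j, 0)] * x[(i, j, 1)] for i, j in pairs),
    )


def kappa(theta, g):
    """Brute-force normalizing quantity: sum of exp(theta'z(x)) over every multigraph."""
    variables = cells(g)
    total = 0.0
    for values in product((0, 1), repeat=len(variables)):
        x = dict(zip(variables, values))
        total += exp(sum(t * s for t, s in zip(theta, z(x, g))))
    return total


theta = (-1.0, -0.5, 0.8)  # illustrative values only
g = 3
n_terms = 2 ** len(cells(g))  # number of multigraphs summed over
brute = kappa(theta, g)

# Closed form, valid here only because ordered pairs are independent under this model.
per_pair = 1 + exp(theta[0]) + exp(theta[1]) + exp(sum(theta))
closed = per_pair ** (g * (g - 1))
print(n_terms, brute, closed)
```

Adding a single actor (g = 4) already raises the number of terms from $2^{12}$ to $2^{24}$, which suggests why enumeration is not practical for networks of realistic size.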

3.2.3. The logit model

We can turn model (2) into a generalized autologistic model for conditional probabilities, giving us an equivalence between model (2) and spatial models (Besag, 1972, 1974; Strauss, 1992). The step utilizes the dichotomous nature of the random variable $X_{ijm}$ and produces an approximate likelihood function that is much easier to deal with.

We first condition on the complement of $X_{ijm}$ and consider just the probability that the dichotomous random variable $X_{ijm}$ is unity. Recall that this variable records whether the tie from i to j of type m is present. Specifically, consider

$$P(X_{ijm} = 1 \mid X^c_{ijm}) = \frac{P(X = x^+_{ijm})}{P(X = x^+_{ijm}) + P(X = x^-_{ijm})} = \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^+_{ijm})\} + \exp\{\theta' z(x^-_{ijm})\}}, \tag{3}$$

which has the advantage of not depending on the normalizing quantity. We next consider the odds ratio, which simplifies model (3):

$$\frac{P(X_{ijm} = 1 \mid X^c_{ijm})}{P(X_{ijm} = 0 \mid X^c_{ijm})} = \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^-_{ijm})\}} = \exp\{\theta' [z(x^+_{ijm}) - z(x^-_{ijm})]\}. \tag{4}$$

From this, the log-odds ratio, or logit model, has the rather simple expression

$$\omega_{ijm} = \log\left\{\frac{P(X_{ijm} = 1 \mid X^c_{ijm})}{P(X_{ijm} = 0 \mid X^c_{ijm})}\right\} = \theta' [z(x^+_{ijm}) - z(x^-_{ijm})]. \tag{5}$$

If we define $\delta(x_{ijm}) = [z(x^+_{ijm}) - z(x^-_{ijm})]$, then the logit model (5) simplifies succinctly to $\omega_{ijm} = \theta' \delta(x_{ijm})$. The expression $\delta(x_{ijm})$ is the vector of network statistics that arises when the quantity $x_{ijm}$ changes from 1 to 0. This version of the model is a logit p* model for a multivariate network and is a generalized autologistic model (see Strauss, 1992) applied to social network data.

3.3. Estimation

The likelihood function for the general form of multivariate p* model (2) is

$$L(\theta) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)},$$

where the dependence on the normalizing quantity can easily be seen. As mentioned, maximum likelihood estimation of $\theta$ is difficult due to the size of the sample space.

An approximate estimation approach, proposed by Besag (1975, 1977b) and adopted by Strauss (1986), Strauss & Ikeda (1990) and Wasserman & Pattison (1996), utilizes tools made popular in models for rectangular lattices and spatial data. Specifically, we use the logit formulation and define the pseudo-likelihood function as

$$PL(\theta) = \prod_{i \neq j} \prod_{m=1}^{r} P(X_{ijm} = 1 \mid X^c_{ijm})^{x_{ijm}} \, P(X_{ijm} = 0 \mid X^c_{ijm})^{1 - x_{ijm}}, \tag{6}$$

and a maximum pseudo-likelihood estimator (MPLE) to be the value of $\theta$ that maximizes (6). MPLEs are much easier to calculate than maximum likelihood estimators (MLEs). MPLEs differ from MLEs for all but the simplest models (those for which the conditional probabilities are indeed independent of the complement relation). Basically, the approach assumes conditional independence of the random variables representing the multivariate relational ties (for discussion of the issues in using maximum pseudo-likelihood rather than maximum likelihood estimation, see Wasserman & Pattison, 1996, and Preisler, 1993).
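Taking logarithms of (6) and substituting the logit form gives a sum with one term per tie variable, each equal to the observed value times the logit minus log(1 + exp(logit)). The sketch below evaluates this log pseudo-likelihood; it assumes the observed ties and their change-statistic vectors have already been arranged as parallel lists, and the small data set is made up for illustration.

```python
from math import exp, log


def log_pseudo_likelihood(theta, y, d):
    """Sum over tie variables of y * omega - log(1 + exp(omega)), omega = theta' d."""
    total = 0.0
    for y_k, d_k in zip(y, d):
        omega = sum(t * s for t, s in zip(theta, d_k))
        total += y_k * omega - log(1.0 + exp(omega))
    return total


# Hypothetical data: four tie variables, two change statistics each.
y = [1, 0, 1, 1]
d = [[1, 0], [1, 1], [1, 2], [1, 0]]

print(log_pseudo_likelihood([0.0, 0.0], y, d))
```

With all parameters set to zero every conditional probability is one half, so the value is the number of tie variables times log(1/2), which gives a simple check on an implementation.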

There is a large literature on the use of approximate likelihoods in spatial modelling. Diggle (1996) reviews models for discrete spatial variation and notes that there are several possible estimation techniques. He notes in his detailed discussion that MPLEs are more efficient than other possibilities (which include the coding method of Besag, 1974). Further, for moderately large samples, the differences between MPLEs and MLEs are often negligible. Small sample sizes, and hence small networks (g < 10), unfortunately are particularly problematic.


In social network modelling, Strauss & Ikeda (1990) established that estimation of $\theta$ for single dichotomous relations can be accomplished via logistic regression, using any standard logistic regression model-fitting routine. In particular, they showed that maximizing the pseudo-likelihood given in equation (6) is equivalent to maximizing the likelihood function for the fit of logistic regression to model (5) (for independent observations $x_{ijm}$). Further, they observed that such logistic regressions can be fitted using iteratively reweighted Gauss-Newton computational techniques, as implemented by any logistic regression model package.

The proof of this result uses the fact that the derivatives of the pseudo-likelihood, set equal to zero, are identical to those obtained from a logistic regression with the relational variables as data values. Thus, fitting p* can be done by using the logit p* form and assuming that the relational variables are actually statistically independent. The idea for this theorem was first suggested by Frank & Strauss (1986) for estimation of the parameters in their triad model. The generalization of this result to the three-way binary array X is straightforward.
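The equivalence between maximum pseudo-likelihood estimation and logistic regression can be illustrated with a short sketch. The code below builds the change statistics for a toy model with one choice and one reciprocity parameter per relation, then fits the logistic regression by plain Newton-Raphson iterations. The toy model, the function names and the small network are assumptions made for illustration; any standard logistic regression routine could be substituted for `fit_logistic`.

```python
import numpy as np

def design_matrix(x):
    """Stack delta(x_ijm) and x_ijm over all ordered pairs i != j and relations m."""
    g, _, r = x.shape
    rows, y = [], []
    for m in range(r):
        for i in range(g):
            for j in range(g):
                if i == j:
                    continue
                d = np.zeros(2 * r)
                d[m] = 1.0               # choice: one more tie of type m
                d[r + m] = x[j, i, m]    # reciprocity: mutual dyad formed if j -> i exists
                rows.append(d)
                y.append(x[i, j, m])
    return np.array(rows), np.array(y, dtype=float)

def fit_logistic(D, y, n_iter=50, tol=1e-10):
    """Newton-Raphson for logistic regression with no separate intercept column."""
    theta = np.zeros(D.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(D @ theta)))
        grad = D.T @ (y - p)
        hess = D.T @ (D * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(hess, grad)
        theta = theta + step
        if np.max(np.abs(step)) < tol:
            break
    return theta

# Hypothetical network with g = 4 actors and r = 2 relations.
x = np.zeros((4, 4, 2), dtype=int)
for (i, j) in [(0, 1), (1, 0), (0, 2), (1, 3)]:                   # relation 0
    x[i, j, 0] = 1
for (i, j) in [(0, 1), (1, 0), (2, 3), (3, 2), (0, 2), (3, 1)]:   # relation 1
    x[i, j, 1] = 1

D, y = design_matrix(x)
theta_hat = fit_logistic(D, y)   # order: choice 0, choice 1, reciprocity 0, reciprocity 1
```

For this dyadic toy model the estimates can be checked by hand, since within each relation the regression reduces to a 2 × 2 table of "reciprocating tie present" against "tie present". With larger models that include triadic statistics no such closed form is available, and the regression must be fitted numerically.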

The evaluation of the fit of multivariate p* is not straightforward, but it is helpful to compare the observed values $x_{ijm}$ with the fitted values $\hat{x}_{ijm}$. The fitted values, as is common with dichotomous variables, are defined as $\hat{x}_{ijm} = \hat{P}(X_{ijm}=1 \mid X^{c}_{ijm})$. The estimated conditional probabilities are computed from

$$\operatorname{logit} \hat{P}(X_{ijm}=1 \mid X^{c}_{ijm}) = \hat{\theta}'\delta(x_{ijm}).$$

Two useful indices of fit are the pseudo-likelihood ratio statistic

$$G^2_{PL} = 2 \sum x_{ijm} \log(x_{ijm} / \hat{x}_{ijm})$$

for a model, and the mean of the absolute value of the residuals $(x_{ijm} - \hat{x}_{ijm})$. In the examples below, we report both $G^2_{PL}$ and the mean absolute residual. Unfortunately, as with all other uses of this MPLE approach, the distribution of $G^2_{PL}$ is unknown, even asymptotically, and there is no straightforward way of estimating the standard errors of parameter estimates (although asymptotic standard errors calculated from logistic regression models can give approximate guidance to the modeller). Crouch & Wasserman (1998) give some preliminary results comparing MPLEs to MLEs and report the optimistic finding that, for moderately large networks (g > 10), both standard errors and test statistics based on the pseudo-likelihood approach are quite close to those based on the exact likelihood.
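The two fit indices can be computed directly from the fitted conditional probabilities. The sketch below assumes that the sum in $G^2_{PL}$ runs over both response categories of each tie (present and absent), in which case it coincides with the usual binomial deviance, that is, minus twice the maximized log pseudo-likelihood. That reading, together with the names `D`, `y` and `theta_hat`, is an assumption of this illustration rather than something stated in the text.

```python
import numpy as np

def fit_indices(theta_hat, D, y):
    """Return (G2_PL, mean absolute residual) for fitted parameters theta_hat."""
    x_hat = 1.0 / (1.0 + np.exp(-(D @ theta_hat)))   # fitted values
    log_pl = np.sum(y * np.log(x_hat) + (1.0 - y) * np.log(1.0 - x_hat))
    g2_pl = -2.0 * log_pl                            # deviance form (assumed reading)
    mean_abs_resid = float(np.mean(np.abs(y - x_hat)))
    return float(g2_pl), mean_abs_resid

# Hypothetical one-parameter model: four possible ties, one observed.
D = np.ones((4, 1))
y = np.array([1.0, 0.0, 0.0, 0.0])
theta_hat = np.array([np.log(1.0 / 3.0)])   # fitted probability 0.25 for every tie

g2_pl, mean_abs_resid = fit_indices(theta_hat, D, y)
```

In this small example the fitted probability equals the observed density, so the residuals are 0.75 for the one observed tie and 0.25 for each absent tie.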

3.4. Computational details

Maximum pseudo-likelihood estimates of the parameters of model (1) are obtained by fitting the logistic regression model (5). In order to fit model (5), we compute, for each relational tie, the values of the 'explanatory variables' $z(x^{+}_{ijm}) - z(x^{-}_{ijm})$ corresponding to each statistic $z(x)$; we then use these as the observed explanatory variables for the realization of $X_{ijm}$ (the 'response variable') in the logistic regression corresponding to model (5).

The computation of the values $z(x^{+}_{ijm}) - z(x^{-}_{ijm})$ is simple, but it is useful to note that the values may take a different form for the various types of relational ties (corresponding to the subscript m of $X_{ijm}$). For example, suppose that there are two relations, $X_l$ and $X_h$, and consider the parameter corresponding to the triadic effect $Z = (X_l X_h) \cap X_h$. If we assume homogeneity, then the sufficient statistic for this parameter is $f_Z$. For the two relations, the computed values of the explanatory variable for this triadic effect are equal to the changes in the statistic $f_Z$ when $x_{ijm}$ changes from 1 to 0, for m = l or h. Thus, when m = l (corresponding to the values for the first relation, $X_l$), we compute $\sum_k x_{ikh} x_{jkh}$ as the value of the explanatory variable corresponding to this parameter, and when m = h (corresponding to an $X_h$ tie), we compute

$$\sum_k x_{ikl} x_{kjh} + \sum_k x_{kil} x_{kjh}.$$
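The two closed-form expressions for the triadic effect can be checked numerically. The sketch below takes $f_Z$ to be the number of ordered triples (i, j, k) with a tie from i to j in $X_l$, a tie from j to k in $X_h$ and a tie from i to k in $X_h$, and compares the matrix expressions with a brute-force difference of $f_Z$. The matrix names `L` and `H`, the random test network and the helper functions are hypothetical.

```python
import numpy as np

def f_Z(L, H):
    """Count triples with i -> j in X_l, j -> k in X_h and i -> k in X_h."""
    return int(((L @ H) * H).sum())

def change_stats_closed_form(L, H):
    d_l = H @ H.T            # entry (i, j): sum_k x_ikh * x_jkh
    d_h = L @ H + L.T @ H    # entry (i, j): sum_k x_ikl * x_kjh + sum_k x_kil * x_kjh
    return d_l, d_h

def change_stat_brute_force(L, H, i, j, relation):
    """f_Z with the (i, j) tie of the given relation set to 1, minus f_Z with it set to 0."""
    plus = [L.copy(), H.copy()]
    minus = [L.copy(), H.copy()]
    idx = 0 if relation == "l" else 1
    plus[idx][i, j] = 1
    minus[idx][i, j] = 0
    return f_Z(*plus) - f_Z(*minus)

# Random binary relations on g = 6 actors, with no self-ties.
rng = np.random.default_rng(0)
g = 6
L = (rng.random((g, g)) < 0.4).astype(int)
H = (rng.random((g, g)) < 0.4).astype(int)
np.fill_diagonal(L, 0)
np.fill_diagonal(H, 0)

d_l, d_h = change_stats_closed_form(L, H)
```

The agreement relies on the diagonals being zero, so that no triple uses the same $X_h$ tie twice. The matrix form yields the explanatory variable for every ordered pair at once, which is convenient when assembling the design matrix for the logistic regression.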

4. Examples

We illustrate the construction and fitting of multivariate p* models using two examples.

4.1. The Grade 7 peer network

The first example is an extension of the data analysed by Wasserman & Pattison (1996). Vickers (1981) and Vickers & Chan (1981) obtained network data from 29 students in grade 7 in a school in Victoria, Australia. They asked students to nominate their classmates on a number of relations, including the following:

1) Who are your best friends in the class?
2) Who would you rather not have as a friend?

We label the relations defined by these two questions as $X_B$ (relation 1) and $X_N$ (relation 2), and their associated matrices as B and N, respectively. The matrix for the 'best friends' relation is given here as our Table 2 and the matrix for the 'not friends' relation as our Table 3. As noted by Wasserman & Pattison (1996), actors 1–12 are boys, while actors 13–29 are girls.

In Wasserman & Pattison (1996), we analysed the relation $X_B$ and established that it possessed strong reciprocity and transitivity effects. Here, we fit models simultaneously to the relations $X_B$ and $X_N$ in an attempt to model their mutual interdependence. Our models use the methodology described earlier and are guided by the literature that has speculated on the structure of positive and negative affect ties (see the discussion in Wasserman & Faust, 1994, Chapter 6, on signed graphs); we also compare our models to previous descriptive analyses of similar types of ties. We report the fit of a number of homogeneous models.

Models 1a and 1b: independence. We first fit two versions of a complete independence model, in which we make the (implausible) assumption that all observed ties are independent. In the first version of the model, we allow a single, separate 'choice' parameter $\theta_Z$ (where Z may be either B or N) for each type of relation; in the second, more restricted version, we assume a single common choice parameter. In both versions of the model, the maximal cliques of the dependence graph have the form {(i, j, m)}; in model 1a, the parameters corresponding to this clique are assumed to depend on relation m (but not on actor i or j), whereas in model 1b, the parameter is assumed constant for all i, j and m. The sufficient statistics for model 1a are $f_B$ and $f_N$; model 1b has sufficient statistic $f_{B+N}$. The fit of models 1a and 1b is summarized in Table 4. Neither model provides a good fit, with the mean of the absolute residuals equal to approximately 0.37. Since model 1b is nested in model 1a, the difference between the pseudo-likelihood ratio statistics is of interest, and we note that model 1b appears to be no worse a fit than model 1a ($\Delta G^2_{PL} = 3.2$, and the models differ by one parameter).


Model 2: multiplexity. Model 2 is a multiplexity model, with maximal cliques {(i, j, 1), (i, j, 2)}. The model allows for the possibility that an $X_B$ tie from i to j is conditionally dependent on an $X_N$ tie from i to j. The parameters of the model have the form $\theta_Z$, where Z may be $B$, $N$ or $B \cap N$; the corresponding sufficient statistics are $f_B$, $f_N$ and $f_{B \cap N}$, respectively. Thus, this model adds a single multiplex parameter $\theta_{B \cap N}$ to the two choice parameters in model 1a. Model 2 appears to be a substantial improvement over model 1a ($\Delta G^2_{PL} = 253.7$ with one additional parameter), but the small frequency of $B \cap N$ ties implies that the MPLE of its corresponding parameter is likely to have a large standard error.

Table 2. Vickers & Chan's (1981) network data: 'best friends' relation

0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 1 0 1 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 1 0 1 0 1 1 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 0 1 0
1 1 1 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 0 0 1 1 1
1 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 0 0 1 1 1 1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 1 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 1 0 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 1 0 0 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0

Table 3. Vickers & Chan's (1981) network data: 'not friends' relation

0 0 0 0 0 0 1 0 1 1 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 0 0 1 1
0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 1 1 0 0 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0
1 0 1 1 0 0 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 1
0 0 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
0 0 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0
1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 1
1 0 0 1 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0

Table 4. Summary of fit of models 1a–5b to the grade 7 peer network

Model   No. of parameters   $G^2_{PL}$   Mean absolute residual
1a       2                  1794.1       0.366
1b       1                  1797.3       0.367
2        3                  1540.4       0.329
3a       4                  1585.5       0.315
3b       5                  1558.4       0.311
4       13                  1511.0       0.300
5a      19                  1220.6       0.241
5b      23                  1032.3       0.196

Models 3a and 3b: reciprocity and exchange. Model 3 assumes bivariate dyad independence (as described by Wasserman, 1987) and has maximal cliques {(i, j, 1), (i, j, 2), (j, i, 1), (j, i, 2)}. We fit two restricted versions of the model: first, model 3a, in which only choice and reciprocity effects are assumed (with parameters $\theta_Z$ for $Z = B$, $N$, $B \cap B'$ and $N \cap N'$), and second, model 3b, with an additional exchange parameter $\theta_Z$ for the relation $Z = B \cap N'$. In model 3a, the presence of an $X_B$ tie from i to j is assumed to be conditionally dependent on the presence of an $X_B$ tie from j to i (that is, on the presence of reciprocity); similarly for $X_N$ ties. Model 3b allows, in addition, the presence of an $X_B$ tie from i to j to be conditionally dependent on the presence of an $X_N$ tie from j to i (that is, on the exchange of an $X_N$ tie for an $X_B$ one). We have not fitted the most general homogeneous dyad-independence model, which includes multiplexity parameters, since B and N co-occur only rarely (and, as a result, it is difficult to fit parameters corresponding to relations such as $B \cap N$, $B \cap N \cap B'$, and so forth). The fit statistics in Table 4 indicate that not only is model 3a a substantial improvement over model 1a ($\Delta G^2_{PL} = 208.6$ with just two additional parameters), but also that model 3b provides a marginally better fit than model 3a ($\Delta G^2_{PL} = 27.1$ with one additional parameter). These figures suggest the presence of both reciprocity and exchange effects. Note, though, that the fit of model 3b is still not particularly good, with the mean of the absolute residuals equal to 0.311.

Model 4: path dependence. Model 4 is a path-dependent model and assumes that a tie of any type from i to j may be conditionally dependent on ties of any type from j to some third individual k. Maximal cliques therefore have the form {(i, j, m), (j, i, h)} or {(i, j, m), (j, k, h), (k, i, p)}; parameters and sufficient statistics are given by $\theta_Z$ and $f_Z$, respectively, where Z may be any of the relations $B$, $N$, $B \cap B'$, $N \cap N'$, $B \cap N'$, $BB$, $BN$, $NB$, $NN$, $BB \cap B'$, $BN \cap B'$, $BN \cap N'$ and $NN \cap N'$. Compared to model 3b, model 4 adds only marginally to the fit ($\Delta G^2_{PL} = 47.4$ with eight additional parameters).

Models 5a and 5b: restricted Markov random graph models. The final set of models are path-dependent models with additional dependencies assumed on substantive grounds. All models have the model 4 parameters; in addition, model 5a possesses dependencies consistent with the transitivity-like hypothesis that friends are likely to agree on their relations with third parties (hence, likely pairwise conditional dependencies between relational ties of type $X_B$ from i to j, i to k and j to k, and also between relational ties of type $X_B$ from i to j, of type $X_N$ from i to k, and of type $X_N$ from j to k). Model 5b possesses additional dependencies consistent with the claim that non-friends are likely to disagree on their relations with third parties (hence, likely pairwise conditional dependencies between relational ties of type $X_N$ from i to j, of type $X_N$ from i to k, and of type $X_B$ from j to k, and also between relational ties of type $X_N$ from i to j, of type $X_B$ from i to k, and of type $X_N$ from j to k). (See Johnsen (1986) for a review and analysis of the literature on the structure of affective ties, and Pattison (1993) for an algebraic translation of these structural claims.) Model 5a adds {(i, j, 1), (j, k, 1), (i, k, 1)} and {(i, j, 1), (j, k, 2), (i, k, 2)} to the set of maximal cliques for model 4; model 5b also adds {(i, j, 2), (j, k, 1), (i, k, 2)} and {(i, j, 2), (j, k, 2), (i, k, 1)}. We note that all of the subcliques of these additional maximal cliques have corresponding parameters in models 5a and 5b; these additional subcliques correspond to various forms of stars: {(i, j, m), (i, k, h)}, {(i, j, m), (k, j, h)} and {(i, j, m), (j, k, h)}. As indicated in Table 4, the additional dependencies assumed by model 5a lead to a substantial improvement over the simple path-dependent model 4 ($\Delta G^2_{PL} = 290.4$ with six additional parameters), and those associated with model 5b lead to a modest further improvement in fit ($\Delta G^2_{PL} = 188.3$ with four additional parameters). The mean of the absolute residuals for model 5b is 0.196, suggesting a more reasonable fit to the data (but one that could lend itself to further possible improvement).
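The nested comparisons reported for the grade 7 network can be reproduced by simple arithmetic on the $G^2_{PL}$ column of Table 4. The sketch below assumes the table entries are read with one decimal place (1794.1, 1797.3 and so on), which is the reading under which the differences match those quoted in the text; the dictionary and function names are hypothetical. Because the distribution of $G^2_{PL}$ is unknown, the differences are descriptive only.

```python
# G2_PL values and parameter counts as read from Table 4 (one decimal place assumed).
g2_pl = {"1a": 1794.1, "1b": 1797.3, "2": 1540.4, "3a": 1585.5,
         "3b": 1558.4, "4": 1511.0, "5a": 1220.6, "5b": 1032.3}
n_par = {"1a": 2, "1b": 1, "2": 3, "3a": 4,
         "3b": 5, "4": 13, "5a": 19, "5b": 23}

def compare(restricted, general):
    """Return (difference in G2_PL, number of additional parameters) for nested models."""
    delta_g2 = round(g2_pl[restricted] - g2_pl[general], 1)
    delta_par = n_par[general] - n_par[restricted]
    return delta_g2, delta_par

comparisons = {
    "1b vs 1a": compare("1b", "1a"),
    "1a vs 2": compare("1a", "2"),
    "1a vs 3a": compare("1a", "3a"),
    "3a vs 3b": compare("3a", "3b"),
    "3b vs 4": compare("3b", "4"),
    "4 vs 5a": compare("4", "5a"),
    "5a vs 5b": compare("5a", "5b"),
}
```

Each entry pairs a drop in $G^2_{PL}$ with the number of parameters added, which is how the comparisons are reported in the text.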

The MPLEs for the parameters of model 5b are displayed in Table 5. Positive estimates were observed for both reciprocity parameters and for the parameters associated with three of the four additional hypothesized dependencies. Thus, the conditional odds of a tie of any type appear to be enhanced if a reciprocal tie of the same type is present, if the tie completes one of the expected triadic structures for agreement between friends, or if the tie completes a triad in which an individual would rather not have as a friend any friend of someone who has been indicated as a non-friend. Negative estimates were obtained for the exchange parameter, for 2-stars comprising two incoming $X_B$ ties, and for 3-cycles comprising $X_B$ ties. Thus, the conditional odds of a tie of any type appear to be reduced by the presence of a reciprocated tie of the other type; in addition, the odds of an $X_B$ tie being directed to a particular individual are reduced if other $X_B$ ties are also directed to the same individual, or if the tie completes a 3-cycle of $X_B$ ties.

Table 5. Parameter estimates for model 5b fitted to the grade 7 peer network

Model parameter                       Z          Estimate of θ_Z   Approximate standard error
1-paths (choice)                      B          -1.81             0.76
                                      N          -2.39             0.65
2-cycles (reciprocity & exchange)     B ∩ B′      2.53             0.37
                                      N ∩ N′      0.61             0.26
                                      B ∩ N′     -0.67             0.28
2-paths                               BB          0.01             0.05
                                      BN         -0.03             0.04
                                      NB         -0.11             0.04
                                      NN          0.02             0.04
3-cycles                              BB ∩ B′    -0.72             0.14
                                      BN ∩ B′     0.05             0.08
                                      BN ∩ N′     0.03             0.07
                                      NN ∩ N′    -0.05             0.09
2-stars                               BB′        -0.36             0.08
                                      BN′        -0.08             0.04
                                      NN′         0.06             0.04
                                      B′B        -0.01             0.04
                                      B′N        -0.04             0.03
                                      N′N         0.07             0.02
Additional hypothesized constraints   BB ∩ B      0.57             0.06
                                      BN ∩ N      0.17             0.05
                                      NB ∩ N      0.33             0.05
                                      NN ∩ B     -0.09             0.06

4.2. Padgett & Ansell's Florentine network

Our second example is an analysis of marriage and business ties among groups of Florentine families (Padgett & Ansell, 1993). In an analysis of the rise to power of the Medici family in Florence in the early fifteenth century, Padgett & Ansell constructed a number of network relations among 33 groups of elite families, including marriage and business or economic ties. The construction was based on a coding of various types of network relations among a 92-family ruling elite from Kent's (1978) description of the network foundations of the Medici party and their opponents. Padgett & Ansell used marriage and economic networks to derive a clustering of the 92 families into 33 family groups (using the CONCOR algorithm; see Breiger, Boorman & Arabie, 1975); they then coded a relation of a particular type between two family groups if there were at least two pairs of families, with one family from each group, linked by a relation of that type. The analysis presented below is for marriage and economic relations among these 33 family groups, shown in Figure 2a of Padgett & Ansell (1993); for the purpose of the analysis reported below, within-group relationships are ignored and the various types of economic ties are aggregated into a single business/economic relation. Thus, a marriage tie is coded from one group to another if a woman of the first group is married to a man in the second; a business/economic tie signifies the presence of trading or partnership relationships, the sharing or renting of real estate, or a bank employment relation (see Padgett & Ansell, 1993, pp. 1265–1266).

Padgett & Ansell used the interconnections among social and demographic factors, these relational ties, and actions on the part of Cosimo de' Medici to explain the source of the latter's extraordinary power; here, we examine the joint network structure of the marriage and business/economic ties.

We label the relations studied by Padgett & Ansell as $X_B$ (business ties) and $X_M$ (marriage ties). Their associated matrices are B and M, respectively.

In Table 6, we report the fit of six classes of models similar in construction to those reported for the grade 7 peer network. As for the grade 7 peer network, models 1a and 1b are two- and one-parameter complete independence models, respectively, and model 2 is a multiplexity model. It is clear from Table 6 that there is little improvement in fit of the two-parameter choice complete independence model (model 1a) over the one-parameter choice model (model 1b) ($\Delta G^2_{PL} = 0.7$ with one extra parameter); in addition, permitting dependencies among marriage and business ties for the same individuals does little to improve model fit ($\Delta G^2_{PL} = 0.4$ for model 2 compared to model 1a). Models 3a and 3b are reciprocity and exchange models. Model 3a adds to model 1a the reciprocity effects for $X_B$ and $X_M$ ties; model 3b further adds the exchange effect that allows conditional dependence of a marriage tie from i to j and a business tie from j to i. The reciprocity effects in model 3a lead to a substantial improvement in fit over model 1a ($\Delta G^2_{PL} = 164.0$ with two additional parameters), but no further improvement is achieved by permitting the dyadic exchange of marriage and business ties ($\Delta G^2_{PL} = 0.2$). Model 4 is a path-dependent model and is a marginal improvement in fit over model 3b ($\Delta G^2_{PL} = 45.1$ with six additional parameters). Parameters corresponding to cycles with two or more business ties were excluded from the model because of the infrequency of occurrence of such structures.

Table 6. Summary of fit of models 1a–6d to the Florentine network

Model   No. of parameters   $G^2_{PL}$   Mean absolute residual
1a       2                  487.2        0.048
1b       1                  487.9        0.048
2        3                  486.8        0.048
3a       4                  323.2        0.032
3b       5                  323.0        0.032
4       11                  277.9        0.029
5a      18                  243.7        0.026
5b      17                  246.3        0.026
6a      21                  227.9        0.026
6b      23                  226.7        0.026
6c      23                  225.2        0.026
6d      23                  217.0        0.025

Since, as Padgett & Ansell (1993) note, the gaining of hierarchical status was the primary consideration in the arrangement of marriage ties between elite families, we might expect marriage ties to exhibit a tendency towards transitivity. Hence, model 5a assumes, in addition to conditional dependencies for paths of length 2, pairwise conditional dependencies among marriage ties from i to j, j to k and i to k (and hence adds a parameter corresponding to the relation $X = MM \cap M$). Further, all possible stars comprising two relations are added as well, in order to investigate possible interdependencies between marriage and business ties that are not evident at the level of ties from an actor i to an actor j (see the comparison between the complete independence model 1a and the multiplex model 2). These dependencies also require various star parameters $\theta_Z$ for Z equal to $MM'$, $M'M$, $M'B$ and $BB'$.

The fit of model 5a was a modest improvement over that of model 4 ($\Delta G^2_{PL} = 34.2$ with six additional parameters). The estimated parameter corresponding to the relation $MM \cap M$ is not large, so in model 5b the parameter is removed, with little effect on the fit of the model ($\Delta G^2_{PL} = 2.6$).

A final set of models fitted to the data investigated the possibility of structural differences in ties according to party affiliation. As Padgett & Ansell (1993) observed, the first 10 family groups are substantially identified with the Medici party (the Medici family themselves comprising group 1), whereas the remaining groups of families are not. Padgett & Ansell described the remarkable structural differences between the network of relations within the Medici party and within the remaining (largely oligarchic) set. Models 6a–6d therefore allow various model 5b parameters to differ according to whether they refer to ties lying within the collection of Medici blocks, to ties connecting non-Medici blocks, or to ties crossing the boundary between the two collections of blocks. Model 6a allows such variation for the density parameter and is a substantial improvement over model 5b ($\Delta G^2_{PL} = 18.4$ with four additional parameters). Model 6b permits the parameters for 'mixed' out-stars comprising marriage and business ties to differ for the three types of blocks and is not a substantial improvement over model 6a ($\Delta G^2_{PL} = 1.4$). Model 6c allows heterogeneity across blocks in the parameters for 2-paths comprising marriage and business ties; it also fails to improve fit compared to model 6a ($\Delta G^2_{PL} = 2.5$). The final model, 6d, permits heterogeneity across blocks in the parameters for paths comprising two marriage ties; in this case, there is a modest improvement in fit compared to model 6a ($\Delta G^2_{PL} = 10.8$ with two additional parameters).

The estimated parameters for model 6d are shown in Table 7. The estimates suggest a strong tendency for reciprocated business ties, a tendency that is unsurprising given the form of business or economic ties such as partnerships. There are weaker tendencies for the existence of 2-paths comprising either marriage or business ties; marriage ties also appear to be more likely if they complete a cycle of three marriage ties. Padgett & Ansell (1993) noted the presence of these cycles and analysed both their development and their consequences; they make a compelling argument for their importance to the evolving structure of the oligarchy. It can also be seen from Table 7 that path structures in which an outgoing marriage tie is accompanied by an incoming business tie reduce the likelihood of the overall structure. Estimates of star parameters suggest the prevalence of heterogeneous stars, in which a group of families have marriage ties with one group and business ties with another. The parameter estimates for homogeneous marriage in-stars and out-stars are both negative: there appears to have been a reduced conditional probability of a marriage tie to a family group if some other group also had such a tie and, to a lesser extent, if the first family group had another outgoing marriage tie.

Table 7. Parameter estimates for model 6d fitted to the Florentine network

Model parameter                       Z / block                 Estimate of θ_Z   Approximate standard error
1-paths (choice)                      M                         -5.17             1.02
                                      B                         -7.37             1.25
2-cycles (reciprocity and exchange)   M ∩ M′                     0.95             0.94
                                      B ∩ B′                    10.33             1.72
                                      M ∩ B′                     0.65             1.08
2-paths                               MM                         0.66             0.32
                                      MB                         0.16             0.38
                                      BM                        -0.84             0.37
                                      BB                         1.26             0.95
3-cycles                              MM ∩ M′                    2.12             0.61
                                      MB ∩ M′                   -0.35             0.85
2-stars                               MM′                       -1.55             0.37
                                      M′M                       -0.43             0.20
                                      BB′                       -1.53             1.08
                                      B′B                       -0.85             0.99
                                      MB′                       -0.14             0.36
                                      M′B                        0.92             0.35
Subgroup-dependent M effects          1-paths within Medici      3.71             1.12
                                      1-paths between subgroups -4.67             1.92
                                      1-paths within other subgroups 0.96
Subgroup-dependent B effects          1-paths within Medici      0.70             1.06
                                      1-paths between subgroups -0.80             0.87
                                      1-paths within other subgroups 0.10
Subgroup-dependent MM effects         2-paths within Medici     -1.33             0.46
                                      2-paths between subgroups  1.08             0.44
                                      2-paths within other subgroups 0.25

The parameters for block-dependent densities suggest an enhanced likelihood of marriage ties within the Medici collection of family groups and, to a lesser extent, within the non-Medici collection; marriage ties between the two types of family groups were less likely. Business ties exhibit a substantially weaker pattern of the same form. Together, these characteristics of the network reflect what Padgett & Ansell noted was a remarkable interdependence of marriage and economic ties, on the one hand, and political partisanship, on the other, and they support their conclusion that the microstructure of marriage and economics was central to the formation of parties in Florence (1993, p. 1277). The block-dependence of marriage 2-paths takes a different and interesting form: such paths are less likely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection, and they are even more likely to link family groups of different types. The group containing members of the Medici family is the major contributor to this pattern, as they are the only Medici group with marriage connections outside the collection mobilized into the Medici party. Note that this structural effect is fitted at the same time as the cyclic pattern for marriage ties, so that although, as Padgett & Ansell noted, there are many more two-step marriage connections for non-Medici than for Medici partisans, many of the former connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).

Thus, model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell's detailed account.

5. Conclusion

The multivariate p* model is very general in form and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996), and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number r of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research: much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus, a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus, the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies, or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form $z(x)$ of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different 'parts' of the network (and in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that, in the literature on algebraic models for multivariate networks, there is a second tradition that can usefully guide such statistical developments. Local structural descriptions based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995) describe heterogeneity across individuals. Thus, a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown in general how Markov chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754) and the National Institute of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance and Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.

Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.

Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.

Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Association, 47, 77–92.

Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.

Boorman, S. A. & White, H. C. (1976). Social structure from multiple networks: II. Role structures. American Journal of Sociology, 81, 1384–1446.

Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.

Breiger, R. L., Boorman, S. A. & Arabie, P. (1975). An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.

Coleman, J. S., Katz, E. & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N. & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R. & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B. & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure II: Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E. & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds), Operations research proceedings 1986, pp. 455–460. Berlin/Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathematiques, Informatique et Science Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B. & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O. & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds), Quo Vadis, Graph Theory? Annals of Discrete Mathematics, 55, 349–366.

Frank, O. & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J. & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J. & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W. & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W. & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J. & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D. & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.


Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L. & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E. & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A. & Pattison, P. (1998). Multivariate p* models of producers’ market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P. & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F. & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P. & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G. & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). $p_{1/2}$. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D. & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M. & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.

Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S. & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S. & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S. & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 60, 401–425.

Wasserman, S. & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S. & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A. & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C. & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999


network statistics identified in Table 1. In order more easily to define the statistics, we introduce a counting function $\phi$ for an array $Z$ as the sum of entries in the array: $\phi_Z = \sum_{i,j} Z_{ij}$. The function $\phi$ is a count of the number of distinct ordered pairs of nodes $i$ and $j$ for which there is a relational tie of type $Z$. For convenience, we refer to the parameter corresponding to $\phi_Z$ as $\theta_Z$.

When homogeneity constraints are imposed, we can represent the sufficient statistics in a compact form. For the assumption of multiplex conditional dependencies, any clique in $N_D$ has the form

\[ A = \{(i, j, m_1), (i, j, m_2), \ldots, (i, j, m_q)\}; \]

thus, in the homogeneous case, the sufficient statistic for the multiplex parameter associated with the clique $A$ is $\phi_Z$, where $Z = X_{m_1} \cap X_{m_2} \cap \cdots \cap X_{m_q}$. (Note that any non-empty subset of relations gives rise to a clique of this form, so that we also have statistics of the form $\phi_{X_m}$, $\phi_{X_k \cap X_l}$, and so forth.)

Reciprocity cliques of the form $\{(i, j, m), (j, i, l)\}$ give rise to the exchange statistics $\phi_{X_k \cap X'_l}$.

Cliques in role-interlocking dependence structures lead to additional 2-path and 3-cycle statistics of the form $\phi_{X_m X_h}$ and $\phi_{(X_m X_h) \cap X'_n}$, respectively.

Some of the statistics for parameters reflecting row and column effects can be defined using the indicator matrices $R_i$ and $C_j$, whose elements are given by

\[ (R_i)_{kl} = \begin{cases} 1 & \text{if } k = i, \\ 0 & \text{otherwise,} \end{cases} \qquad (C_j)_{kl} = \begin{cases} 1 & \text{if } l = j, \\ 0 & \text{otherwise.} \end{cases} \]

In order to define statistics for the Markov random multigraph model, let $R_k$ be any subset of relations and define $Y_k$ as the intersection of the relations in $R_k$. The triad statistic corresponding to a general multivariate triad has the general form $\phi_Z$, with $Z = (Y_1 \cap Y'_4)(Y_2 \cap Y'_5) \cap (Y_3 \cap Y'_6)$ for some $Y_1, Y_2, \ldots, Y_6$.

When homogeneity is imposed only within $S$ possible blocks or positions, the network statistics that arise correspond to within-block sums, and can be represented by using the matrix $\delta_{st}$, with entries

\[ (\delta_{st})_{ij} = \begin{cases} 1 & \text{if } i \in \text{block } s \text{ and } j \in \text{block } t, \\ 0 & \text{otherwise.} \end{cases} \]

For example, in the case of any homogeneous statistic $\phi_Z$, the block-homogeneous set of statistics is $\{\phi_{Z \cap \delta_{st}}: s = 1, 2, \ldots, S;\ t = 1, 2, \ldots, S\}$.

Some other network statistics and associated parameters are also presented in Table 1. This table also identifies the parameter labels used in Wasserman & Pattison (1996) and their generalizations to multivariate networks.

Note that each of the statistics described above may be assumed to be homogeneous, or may be allowed to depend on some mutually exclusive and exhaustive partition of actors or pairs of actors. For example, generalized transitivity statistics may be calculated for every triple of subgroups arising from a partition (for example, $\phi_{(X_k \cap \delta_{su})(X_l \cap \delta_{ut}) \cap (X_m \cap \delta_{st})}$), and may be used to assess the homogeneity of generalized transitivity across subgroups.
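As a rough illustration of the counting function, the following Python sketch computes a few homogeneous statistics for two small binary relations. The helper names (`phi`, `intersect`, `transpose`, `compose`, `block_indicator`), the toy matrices and the block assignment are all assumptions introduced here for illustration; in particular, `compose` counts 2-paths with an ordinary matrix product, which is one possible reading of the composite relation $X_m X_h$.

```python
# Toy sketch: homogeneous network statistics for two binary relations.
# All names and data below are illustrative, not taken from the paper.

def phi(z):
    """Counting function: the sum of all entries of the array z."""
    return sum(sum(row) for row in z)

def intersect(a, b):
    """Entrywise intersection (product) of two arrays."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def transpose(a):
    """Converse relation: entry (i, j) of the result is a[j][i]."""
    return [list(col) for col in zip(*a)]

def compose(a, b):
    """Number of 2-paths i -> k -> j, first step in a, second step in b."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def block_indicator(blocks, s, t):
    """delta_st: 1 where the row actor is in block s and the column actor in block t."""
    return [[1 if bi == s and bj == t else 0 for bj in blocks] for bi in blocks]

X1 = [[0, 1, 1],
      [1, 0, 0],
      [0, 1, 0]]
X2 = [[0, 1, 0],
      [0, 0, 1],
      [1, 0, 0]]

choice_1 = phi(X1)                                            # phi_{X1}
multiplex = phi(intersect(X1, X2))                            # phi_{X1 ∩ X2}
exchange = phi(intersect(X1, transpose(X2)))                  # phi_{X1 ∩ X2'}
two_path = phi(compose(X1, X2))                               # phi_{X1 X2}
three_cycle = phi(intersect(compose(X1, X2), transpose(X1)))  # phi_{(X1 X2) ∩ X1'}

# Block-homogeneous version of the choice statistic for a hypothetical partition.
blocks = [0, 0, 1]
within_block_0 = phi(intersect(X1, block_indicator(blocks, 0, 0)))
```

Each statistic is a single number for the whole network; the block-restricted variant simply masks the array with $\delta_{st}$ before counting.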


3.2.2. The model

In combination with various homogeneity constraints, model (1) can be written in the general form

\[ P(X = x) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)}, \tag{2} \]

where $\theta$ is a vector of model parameters and $z(x)$ is a vector of network statistics. As we have described, these vectors depend on the structure of the hypothesized dependence graph and on whether any homogeneity constraints have been proposed.

The model is of exponential family form; that is, the probability function depends on an exponential function of a linear combination of network statistics. In some cases, constraints on the elements of $\theta$ are required in order to ensure a set of uniquely determined parameters (as we illustrate later with our examples). Usually, the elements of $\theta$ are unknown and must be estimated.

The function $\kappa(\theta)$ in the denominator of model (2) is a normalizing quantity, whose value guarantees that the probability distribution is indeed proper, summing to unity over the sample space of the random variable $X$ (the set of all possible multivariate networks with $r$ relations and $g$ actors).

Estimation of the parameters of models that assume only multiplexity and/or generalized reciprocity and exchange effects (as in the multivariate $p_1$ model) is not particularly difficult. In these cases, the likelihood function is simply the product of the probabilities for each multivariate tie or dyad (for example, see Wasserman, 1987). Estimation of parameters of the general multivariate p* model is not straightforward, however. The likelihood function for the parameters $\theta$ of p* depends on the complicated normalizing quantity $\kappa(\theta)$, which makes maximum likelihood estimation difficult except in special circumstances (such as dyadic independence) and when the multigraphs are quite small (Walker, 1995). In order for probabilities to be computed, one must be able to calculate $\kappa$, which is just too difficult for most networks. Hence, alternative model formulations and approximate estimation techniques are important. One such alternative, which we now describe, utilizes log-odds ratios of the conditional probabilities of each element of $X$.
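To give a feel for why the normalizing quantity is the obstacle, the sketch below computes it by brute-force enumeration for a tiny network. The function names and the choice of statistics (tie count and number of reciprocated same-type pairs) are assumptions made for this illustration only; the sample space is taken to be all directed multigraphs without loops, so it has $2^{r g (g-1)}$ members.

```python
from itertools import product
from math import exp

def statistics(x, g, r):
    """Toy z(x): (total number of ties, number of reciprocated same-type pairs)."""
    ties = sum(x.values())
    mutual = sum(x[i, j, m] * x[j, i, m]
                 for i in range(g) for j in range(i + 1, g) for m in range(r))
    return (ties, mutual)

def kappa(theta, g, r):
    """Sum exp(theta' z(x)) over every multigraph with r relations on g actors."""
    cells = [(i, j, m) for i in range(g) for j in range(g) if i != j
             for m in range(r)]
    total = 0.0
    for values in product((0, 1), repeat=len(cells)):
        x = dict(zip(cells, values))
        z = statistics(x, g, r)
        total += exp(sum(t * s for t, s in zip(theta, z)))
    return total

# Feasible: 3 actors, 1 relation, so 2**6 = 64 networks to enumerate.
small = kappa((-1.0, 0.0), g=3, r=1)

# Infeasible: 29 actors and 2 relations, as in the grade 7 example.
n_grade7 = 2 ** (2 * 29 * 28)
```

With the reciprocity parameter set to zero the ties are independent and the sum collapses to $(1 + e^{\theta})^{6}$, which gives a check on the enumeration; for the grade 7 data the number of terms has 489 decimal digits, so direct summation is out of the question.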

3.2.3. The logit model

We can turn model (2) into a generalized autologistic model for conditional probabilities, giving us an equivalence between model (2) and spatial models (Besag, 1972, 1974; Strauss, 1992). The step utilizes the dichotomous nature of the random variable $X_{ijm}$, and produces an approximate likelihood function that is much easier to deal with.

We first condition on the complement of $X_{ijm}$, and consider just the probability that the dichotomous random variable $X_{ijm}$ is unity. Recall that this variable records whether the tie from $i$ to $j$ of type $m$ is present. Specifically, consider

\[ P(X_{ijm} = 1 \mid X^c_{ijm}) = \frac{P(X = x^+_{ijm})}{P(X = x^+_{ijm}) + P(X = x^-_{ijm})} = \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^+_{ijm})\} + \exp\{\theta' z(x^-_{ijm})\}}, \tag{3} \]

which has the advantage of not depending on the normalizing quantity. We next consider the odds ratio, which simplifies model (3):

\[ \frac{P(X_{ijm} = 1 \mid X^c_{ijm})}{P(X_{ijm} = 0 \mid X^c_{ijm})} = \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^-_{ijm})\}} = \exp\{\theta' [z(x^+_{ijm}) - z(x^-_{ijm})]\}. \tag{4} \]

From this, the log-odds ratio, or logit model, has the rather simple expression

\[ \omega_{ijm} = \log\left\{\frac{P(X_{ijm} = 1 \mid X^c_{ijm})}{P(X_{ijm} = 0 \mid X^c_{ijm})}\right\} = \theta' [z(x^+_{ijm}) - z(x^-_{ijm})]. \tag{5} \]

If we define $\delta(x_{ijm}) = [z(x^+_{ijm}) - z(x^-_{ijm})]$, then the logit model (5) simplifies succinctly to $\omega_{ijm} = \theta' \delta(x_{ijm})$. The expression $\delta(x_{ijm})$ is the vector of changes in the network statistics that arise when the quantity $x_{ijm}$ changes from 1 to 0. This version of the model is a logit p* model for a multivariate network, and is a generalized autologistic model (see Strauss, 1992) applied to social network data.

3.3. Estimation

The likelihood function for the general form of multivariate p*, model (2), is

\[ L(\theta) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)}, \]

where the dependence on the normalizing quantity can easily be seen. As mentioned, maximum likelihood estimation of $\theta$ is difficult, due to the size of the sample space.

An approximate estimation approach, proposed by Besag (1975, 1977b) and adopted by Strauss (1986), Strauss & Ikeda (1990) and Wasserman & Pattison (1996), utilizes tools made popular in models for rectangular lattices and spatial data. Specifically, we use the logit formulation and define the pseudo-likelihood function as

\[ PL(\theta) = \prod_{i \neq j} \prod_{m=1}^{r} P(X_{ijm} = 1 \mid X^c_{ijm})^{x_{ijm}} \, P(X_{ijm} = 0 \mid X^c_{ijm})^{1 - x_{ijm}}, \tag{6} \]

and a maximum pseudo-likelihood estimator (MPLE) to be the value of $\theta$ that maximizes (6). MPLEs are much easier to calculate than maximum likelihood estimators (MLEs). MPLEs differ from MLEs for all but the simplest models (those for which the conditional probabilities are indeed independent of the complement relation). Basically, the approach assumes conditional independence of the random variables representing the multivariate relational ties (for discussion of the issues in using maximum pseudo-likelihood rather than maximum likelihood estimation, see Wasserman & Pattison, 1996, and Preisler, 1993).

There is a large literature on the use of approximate likelihoods in spatial modelling. Diggle (1996) reviews models for discrete spatial variation, and notes that there are several possible estimation techniques. He notes in his detailed discussion that MPLEs are more efficient than other possibilities (which include the coding method of Besag, 1974). Further, for moderately large samples, the differences between MPLEs and MLEs are often negligible. Small sample sizes, and hence small networks ($g < 10$), unfortunately, are particularly problematic.

In social network modelling, Strauss & Ikeda (1990) established that estimation of $\theta$ for single, dichotomous relations can be accomplished via logistic regression, using any standard logistic regression model-fitting routine. In particular, they showed that maximizing the pseudo-likelihood given in equation (6) is equivalent to maximizing the likelihood function for the fit of logistic regression to model (5) (for independent observations $x_{ijm}$). Further, they observed that such logistic regressions can be fitted using iteratively reweighted Gauss–Newton computational techniques, as implemented by any logistic regression model package.
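In practice any logistic regression routine will do; purely to make the iterations visible, the following sketch spells out a Newton–Raphson (iteratively reweighted) fit in plain Python. The names `solve` and `fit_mple` and the toy data are assumptions introduced here. No separate intercept is included, on the assumption that a choice parameter is in the model: its change statistic equals 1 for every tie variable and so plays the role of the constant term.

```python
from math import exp, log

def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(m[r][c]))
        m[c], m[p] = m[p], m[c]
        for r in range(n):
            if r != c:
                f = m[r][c] / m[c][c]
                m[r] = [u - f * v for u, v in zip(m[r], m[c])]
    return [m[k][n] / m[k][k] for k in range(n)]

def fit_mple(d, y, iterations=25):
    """Newton-Raphson for a logistic regression of y on the columns of d.
    d: one change-statistic vector per tie variable; y: observed 0/1 ties."""
    p = len(d[0])
    theta = [0.0] * p
    for _ in range(iterations):
        fitted = [1.0 / (1.0 + exp(-sum(t * v for t, v in zip(theta, row))))
                  for row in d]
        # Score vector (gradient of the log pseudo-likelihood).
        score = [sum((yi - fi) * row[k] for yi, fi, row in zip(y, fitted, d))
                 for k in range(p)]
        # Information matrix (negative Hessian).
        info = [[sum(fi * (1.0 - fi) * row[k] * row[l]
                     for fi, row in zip(fitted, d))
                 for l in range(p)] for k in range(p)]
        step = solve(info, score)
        theta = [t + s for t, s in zip(theta, step)]
    return theta

# Independence model: the only explanatory variable is the constant 1.
theta_choice = fit_mple([[1.0]] * 6, [1, 1, 0, 0, 0, 0])

# Two parameters: a constant and one made-up dichotomous change statistic.
d_rows = [[1.0, 0.0], [1.0, 1.0], [1.0, 0.0], [1.0, 1.0], [1.0, 1.0], [1.0, 0.0]]
theta_two = fit_mple(d_rows, [0, 1, 0, 1, 0, 1])
```

For the independence model the estimate is simply the logit of the observed density, which provides a quick check on any fitting routine. The sketch makes no attempt to handle separation or collinear statistics, which a production routine would need to detect.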

The proof of this result uses the fact that the derivatives of the pseudo-likelihood, set equal to zero, are identical to those obtained from a logistic regression with the relational variables as data values. Thus, fitting p* can be done by using the logit p* form and assuming that the relational variables are actually statistically independent. The idea for this theorem was first suggested by Frank & Strauss (1986) for estimation of the parameters in their triad model. The generalization of this result to the three-way binary array $X$ is straightforward.

The evaluation of the fit of multivariate p* is not straightforward, but it is helpful to compare the observed values $x_{ijm}$ with the fitted values $\hat{x}_{ijm}$. The fitted values, as is common with dichotomous variables, are defined as $\hat{x}_{ijm} = \hat{P}(X_{ijm} = 1 \mid X^c_{ijm})$. The estimated conditional probabilities are computed from

\[ \operatorname{logit} \hat{P}(X_{ijm} = 1 \mid X^c_{ijm}) = \hat{\theta}' \delta(x_{ijm}). \]

Two useful indices of fit are the pseudo-likelihood ratio statistic

\[ G^2_{PL} = 2 \sum x_{ijm} \log(x_{ijm} / \hat{x}_{ijm}) \]

for a model, and the mean of the absolute value of the residuals $(x_{ijm} - \hat{x}_{ijm})$. In the examples below, we report both $G^2_{PL}$ and the mean absolute residual. Unfortunately, as with all other uses of this MPLE approach, the distribution of $G^2_{PL}$ is unknown, even asymptotically, and there is no straightforward way of estimating the standard errors of parameter estimates (although asymptotic standard errors calculated from logistic regression models can give approximate guidance to the modeller). Crouch & Wasserman (1998) give some preliminary results comparing MPLEs to MLEs, and report the optimistic finding that for moderately large networks ($g > 10$), both standard errors and test statistics based on the pseudo-likelihood approach are quite close to those based on the exact likelihood.
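A minimal sketch of the two fit indices follows. It assumes that the sum in the pseudo-likelihood ratio statistic runs over both outcomes of each tie variable (tie present and tie absent), in which case the statistic equals $-2 \log PL$, the deviance that a logistic regression routine reports; that reading is an interpretation made here, and the function names are hypothetical.

```python
from math import exp, log

def fitted_probability(theta, d):
    """Estimated conditional probability of a tie from its change statistics."""
    omega = sum(t * v for t, v in zip(theta, d))
    return 1.0 / (1.0 + exp(-omega))

def fit_indices(theta, d_rows, x_obs):
    """Return (-2 log pseudo-likelihood, mean absolute residual)."""
    fitted = [fitted_probability(theta, d) for d in d_rows]
    log_pl = sum(log(f) if x == 1 else log(1.0 - f)
                 for x, f in zip(x_obs, fitted))
    mean_abs_residual = sum(abs(x - f) for x, f in zip(x_obs, fitted)) / len(x_obs)
    return -2.0 * log_pl, mean_abs_residual

# Toy check: four tie variables, each with fitted probability 0.5.
g2_pl, mar = fit_indices([0.0], [[1.0]] * 4, [1, 0, 1, 0])
```

Under an independence model with common tie probability $p$, the mean absolute residual reduces to $2p(1-p)$, so it can never exceed 0.5; values close to that bound indicate a model that explains little.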

3.4. Computational details

Maximum pseudo-likelihood estimates of the parameters of model (1) are obtained by fitting the logistic regression model (5). In order to fit model (5), we compute for each relational tie the values of the ‘explanatory variables’ $z(x^+_{ijm}) - z(x^-_{ijm})$ corresponding to each statistic $z(x)$; we then use these as the observed explanatory variables for the realization of $X_{ijm}$ (the ‘response variable’) in the logistic regression corresponding to model (5).

The computation of the values $z(x^+_{ijm}) - z(x^-_{ijm})$ is simple, but it is useful to note that the values may take a different form for the various types of relational ties (corresponding to the subscript $m$ of $X_{ijm}$). For example, suppose that there are two relations, $X_l$ and $X_h$, and consider the parameter corresponding to the triadic effect $Z = (X_l X_h) \cap X_h$. If we assume homogeneity, then the sufficient statistic for this parameter is $\phi_Z$. For the two relations, the computed values of the explanatory variable for this triadic effect are equal to the changes in the statistic $\phi_Z$ when $x_{ijm}$ changes from 1 to 0, for $m = l$ or $h$. Thus, when $m = l$ (corresponding to the values for the first relation, $X_l$), we compute $\sum_k x_{ikh} x_{jkh}$ as the value of the explanatory variable corresponding to this parameter, and when $m = h$ (corresponding to an $X_h$ tie), we compute $\sum_k x_{ikl} x_{kjh} + \sum_k x_{kil} x_{kjh}$.
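The two closed-form expressions can be checked against a direct toggle of each tie. The sketch below does so on made-up data, under two assumptions stated here rather than in the text: the composite relation counts every intermediate actor $k$ (so $\phi_Z = \sum_{i,j,k} x_{ikl} x_{kjh} x_{ijh}$), and both matrices have zero diagonals. All function names are hypothetical.

```python
def phi_triadic(xl, xh):
    """phi_Z for Z = (X_l X_h) ∩ X_h, counting every intermediate actor k."""
    g = len(xl)
    return sum(xl[i][k] * xh[k][j] * xh[i][j]
               for i in range(g) for j in range(g) for k in range(g))

def toggled(x, i, j, value):
    """Copy of x with entry (i, j) set to value."""
    y = [row[:] for row in x]
    y[i][j] = value
    return y

def change_l(xl, xh, i, j):
    """Closed form for a tie of type l: sum_k x_ikh x_jkh."""
    return sum(xh[i][k] * xh[j][k] for k in range(len(xl)))

def change_h(xl, xh, i, j):
    """Closed form for a tie of type h: sum_k x_ikl x_kjh + sum_k x_kil x_kjh."""
    g = len(xl)
    return (sum(xl[i][k] * xh[k][j] for k in range(g))
            + sum(xl[k][i] * xh[k][j] for k in range(g)))

def brute_force_change(xl, xh, i, j, m):
    """phi_Z with the tie forced on, minus phi_Z with the tie forced off."""
    if m == "l":
        return (phi_triadic(toggled(xl, i, j, 1), xh)
                - phi_triadic(toggled(xl, i, j, 0), xh))
    return (phi_triadic(xl, toggled(xh, i, j, 1))
            - phi_triadic(xl, toggled(xh, i, j, 0)))

# Made-up 4-actor relations with zero diagonals.
XL = [[0, 1, 0, 1], [1, 0, 1, 0], [0, 1, 0, 0], [1, 0, 1, 0]]
XH = [[0, 1, 1, 0], [0, 0, 1, 1], [1, 0, 0, 1], [0, 1, 0, 0]]

all_match = all(
    brute_force_change(XL, XH, i, j, "l") == change_l(XL, XH, i, j)
    and brute_force_change(XL, XH, i, j, "h") == change_h(XL, XH, i, j)
    for i in range(4) for j in range(4) if i != j
)
```

The closed forms avoid recomputing the whole statistic for every tie variable, which matters when the design matrix has one row per ordered pair and relation.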

4. Examples

We illustrate the construction and fitting of multivariate p* models using two examples.

4.1. The Grade 7 peer network

The first example is an extension of the data analysed by Wasserman & Pattison (1996). Vickers (1981) and Vickers & Chan (1981) obtained network data from 29 students in grade 7 in a school in Victoria, Australia. They asked students to nominate their classmates on a number of relations, including the following:

1) Who are your best friends in the class?
2) Who would you rather not have as a friend?

We label the relations defined by these two questions as $X_B$ (relation 1) and $X_N$ (relation 2), and their associated matrices as $B$ and $N$, respectively. The matrix for the ‘best friends’ relation is given here as our Table 2, and the matrix for the ‘not friends’ relation as our Table 3. As noted by Wasserman & Pattison (1996), actors 1–12 are boys, while actors 13–29 are girls.

In Wasserman & Pattison (1996), we analysed the relation $X_B$ and established that it possessed strong reciprocity and transitivity effects. Here, we fit models simultaneously to the relations $X_B$ and $X_N$ in an attempt to model their mutual interdependence. Our models use the methodology described earlier and are guided by the literature that has speculated on the structure of positive and negative affect ties (see the discussion in Wasserman & Faust, 1994, Chapter 6, on signed graphs); we also compare our models to previous descriptive analyses of similar types of ties. We report the fit of a number of homogeneous models.

Models 1a and 1b – independence. We first fit two versions of a complete independence model, in which we make the (implausible) assumption that all observed ties are independent. In the first version of the model, we allow a single separate ‘choice’ parameter $\theta_Z$ (where $Z$ may be either $B$ or $N$) for each type of relation; in the second, more restricted version, we assume a single common choice parameter. In both versions of the model, the maximal cliques of the dependence graph have the form $\{(i, j, m)\}$; in model 1a, the parameters corresponding to this clique are assumed to depend on relation $m$ (but not on actor $i$ or $j$), whereas in model 1b, the parameter is assumed constant for all $i$, $j$ and $m$. The sufficient statistics for model 1a are $\phi_B$ and $\phi_N$; model 1b has sufficient statistic $\phi_{B+N}$. The fit of models 1a and 1b is summarized in Table 4. Neither model provides a good fit, with the mean of the absolute residuals equal to approximately 0.37. Since model 1b is nested in model 1a, the difference between the pseudo-likelihood ratio statistics is of interest, and we note that model 1b appears to be no worse a fit than model 1a ($\Delta G^2_{PL} = 3.2$, and the models differ by one parameter).

Model 2 – multiplexity. Model 2 is a multiplexity model, with maximal cliques $\{(i, j, 1), (i, j, 2)\}$. The model allows for the possibility that an $X_B$ tie from $i$ to $j$ is conditionally dependent on an $X_N$ tie from $i$ to $j$. The parameters of the model have the form $\theta_Z$, where $Z$ may be $B$, $N$ or $B \cap N$; the corresponding sufficient statistics are $\phi_B$, $\phi_N$ and $\phi_{B \cap N}$, respectively. Thus, this model adds a single multiplex parameter $\theta_{B \cap N}$ to the two choice parameters in model 1a. Model 2 appears to be a substantial improvement over model 1a ($\Delta G^2_{PL} = 253.7$, with one additional parameter), but the small frequency of $B \cap N$ ties implies that the MPLE of its corresponding parameter is likely to have a large standard error.

Models 3a and 3b ndash reciprocity and exchange Model 3 assumes bivariate dyad indepen-dence (as described by Wasserman 1987) and has maximal cliques(i j 1) (i j 2) (j i 1) (j i 2) We t two restricted versions of the model rst model3a in which only choice and reciprocity effects are assumed (with parameters hz forZ = B N B Ccedil B 9 and N Ccedil N9 ) and second model 3b with an additional exchange para-meter hz for the relation Z = B Ccedil N 9 In model 3a the presence of an XB tie from i to j isassumed to be conditionally dependent on the presence of an XB tie from j to i (that is on thepresence of reciprocity) similarly for XN ties Model 3b allows in addition the presence ofan XB tie from i to j to be conditionally dependent on the presence of an XN tie from j to i (that

Philippa Pattison and Stanley Wasserman182

Table 2 Vickers amp Chanrsquos (1981) network data lsquobest friendsrsquo relation

0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 1 1 0 1 0 1 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 00 0 0 1 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 1 0 0 0 1 0 1 0 1 1 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 0 1 01 1 1 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 01 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 0 0 1 1 11 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 00 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 01 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 0 0 1 1 1 1 0 0 0 0 0 0 01 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 1 1 1 1 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 1 0 0 0 1 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 1 1 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 1 0 1 0 0 1 1 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 1 0 0 1 1 1 10 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 00 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0

is on the exchange of an X_N tie for an X_B one). We have not fitted the most general homogeneous dyad-independence model, which includes multiplexity parameters, since B and N co-occur only rarely (and, as a result, it is difficult to fit parameters corresponding to relations such as B ∩ N, B ∩ N ∩ B′, and so forth). The fit statistics in Table 4 indicate that not only is model 3a a substantial improvement over model 1a (ΔG²_PL = 208.6 with just two additional parameters), but also that model 3b provides a marginally better fit than model 3a (ΔG²_PL = 27.1 with one additional parameter). These figures suggest the presence of both reciprocity and exchange effects. Note, though, that the fit of model 3b is still not particularly good, with the mean of the absolute residuals equal to 0.311.

Table 3. Vickers & Chan's (1981) network data: 'not friends' relation

0 0 0 0 0 0 1 0 1 1 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 0 0 1 1
0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 1 1 0 0 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0
1 0 1 1 0 0 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 1
0 0 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
0 0 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0
1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 1
1 0 0 1 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0

Table 4. Summary of fit of models 1a–5b to the grade 7 peer network

Model   No. of parameters   G²_PL    Mean absolute residual
1a       2                  1794.1   0.366
1b       1                  1797.3   0.367
2        3                  1540.4   0.329
3a       4                  1585.5   0.315
3b       5                  1558.4   0.311
4       13                  1511.0   0.300
5a      19                  1220.6   0.241
5b      23                  1032.3   0.196
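The ΔG²_PL comparisons quoted for the grade 7 network are differences between the pseudo-likelihood deviances of nested models in Table 4. The following Python sketch reproduces that arithmetic. The dictionary `g2_pl` and the helper `delta_g2` are illustrative names, and the deviances are the Table 4 values read with one decimal place, which is an assumption about the typeset table.

```python
# (number of parameters, G2_PL) for each model, as read from Table 4.
# Reading the deviances with one decimal place is an assumption.
g2_pl = {
    "1a": (2, 1794.1),
    "1b": (1, 1797.3),
    "2": (3, 1540.4),
    "3a": (4, 1585.5),
    "3b": (5, 1558.4),
    "4": (13, 1511.0),
    "5a": (19, 1220.6),
    "5b": (23, 1032.3),
}


def delta_g2(smaller, larger):
    """Return (drop in G2_PL, extra parameters) from the smaller to the larger model."""
    p_small, g_small = g2_pl[smaller]
    p_large, g_large = g2_pl[larger]
    return round(g_small - g_large, 1), p_large - p_small


# The nested comparisons discussed in the text.
for pair in [("1a", "3a"), ("3a", "3b"), ("3b", "4"), ("4", "5a"), ("5a", "5b")]:
    print(pair, delta_g2(*pair))
```

Each printed pair gives the drop in deviance and the number of added parameters, which can be set beside the figures quoted in the text.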

Model 4 – path dependence. Model 4 is a path-dependent model and assumes that a tie of any type from i to j may be conditionally dependent on ties of any type from j to some third individual k. Maximal cliques therefore have the form {(i, j, m), (j, i, h)} or {(i, j, m), (j, k, h), (k, i, p)}; parameters and sufficient statistics are given by θ_Z and φ_Z, respectively, where Z may be any of the relations B, N, B ∩ B′, N ∩ N′, B ∩ N′, BB, BN, NB, NN, BB ∩ B′, BN ∩ B′, BN ∩ N′ and NN ∩ N′. Compared to model 3b, model 4 adds only marginally to the fit (ΔG²_PL = 47.4 with eight additional parameters).
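To make the relations listed for the path-dependent model more concrete, the sketch below counts the configurations associated with a few of them from two binary adjacency matrices. It assumes that B and N are square 0/1 arrays with zero diagonals and that the statistic for a compound relation is a count of the matching configurations (for example, BN ∩ B′ is taken to count triples with a B tie from i to k, an N tie from k to j and a B tie from j back to i). The function name, dictionary keys and toy data are illustrative and are not taken from the paper.

```python
import numpy as np


def path_statistics(B, N):
    """Count a few dyadic and triadic configurations in a bivariate network.

    B, N: square 0/1 arrays with zero diagonals (assumed, not checked).
    """
    B = np.asarray(B, dtype=int)
    N = np.asarray(N, dtype=int)
    BN = B @ N  # BN[i, j] = number of k with i -B-> k and k -N-> j
    return {
        "choice_B": int(B.sum()),
        "choice_N": int(N.sum()),
        # Mutual B dyads, each dyad counted once.
        "recip_B": int(np.trace(B @ B)) // 2,
        # Exchange: a B tie from i to j together with an N tie from j to i.
        "exch_BN": int(np.trace(BN)),
        # 2-paths i -B-> k -N-> j between distinct i and j.
        "paths_BN": int(BN.sum() - np.trace(BN)),
        # 3-cycles i -B-> k -N-> j -B-> i.
        "cycle_BN_B": int((BN * B.T).sum()),
    }


# Toy data: B ties 0->1, 1->0, 2->0 and a single N tie 1->2.
B = np.array([[0, 1, 0], [1, 0, 0], [1, 0, 0]])
N = np.array([[0, 0, 0], [0, 0, 1], [0, 0, 0]])
stats = path_statistics(B, N)
print(stats)
```

In the toy data the only 3-cycle of the BN ∩ B′ form is 0 → 1 (B), 1 → 2 (N), 2 → 0 (B). Whether mutual dyads are counted once or twice is a convention that rescales the corresponding parameter.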

Models 5a and 5b – restricted Markov random graph models. The final set of models are path-dependent models with additional dependencies assumed on substantive grounds. All models have the model 4 parameters; in addition, model 5a possesses dependencies consistent with the transitivity-like hypothesis that friends are likely to agree on their relations with third parties (hence, likely pairwise conditional dependencies between relational ties of type X_B from i to j, i to k and j to k, and also between relational ties of type X_B from i to j, of type X_N from i to k, and of type X_N from j to k). Model 5b possesses additional dependencies consistent with the claim that non-friends are likely to disagree on their relations with third parties (hence, likely pairwise conditional dependencies between relational ties of type X_N from i to j, of type X_N from i to k, and of type X_B from j to k, and also between relational ties of type X_N from i to j, of type X_B from i to k, and of type X_N from j to k). (See Johnsen (1986) for a review and analysis of the literature on the structure of affective ties, and Pattison (1993) for an algebraic translation of these structural claims.) Model 5a adds {(i, j, 1), (j, k, 1), (i, k, 1)} and {(i, j, 1), (j, k, 2), (i, k, 2)} to the set of maximal cliques for model 4; model 5b also adds {(i, j, 2), (j, k, 1), (i, k, 2)} and {(i, j, 2), (j, k, 2), (i, k, 1)}. We note that all of the subcliques of these additional maximal cliques have corresponding parameters in models 5a and 5b; these additional subcliques correspond to various forms of stars: {(i, j, m), (i, k, h)}, {(i, j, m), (k, j, h)} and {(i, j, m), (j, k, h)}. As indicated in Table 4, the additional dependencies assumed by model 5a lead to a substantial improvement over the simple path-dependent model 4 (ΔG²_PL = 290.4 with six additional parameters), and those associated with model 5b lead to a modest further improvement in fit (ΔG²_PL = 188.3 with four additional parameters). The mean of the absolute residuals for model 5b is 0.196, suggesting a more reasonable fit to the data (but one that could lend itself to further possible improvement).

The MPLEs for the parameters of model 5b are displayed in Table 5. Positive estimates were observed for both reciprocity parameters and for the parameters associated with three of the four additional hypothesized dependencies. Thus, the conditional odds of a tie of any type appear to be enhanced if a reciprocal tie of the same type is present, if the tie completes one of the expected triadic structures for agreement between friends, or if the tie completes a triad in which an individual would rather not have as a friend any friend of someone who has been indicated as a non-friend. Negative estimates were obtained for the exchange parameter, for 2-stars comprising two incoming X_B ties, and for 3-cycles comprising X_B ties. Thus, the conditional odds of a tie of any type appear to be reduced by the presence of a reciprocated tie of the other type; in addition, the odds of an X_B tie being directed to a particular individual are reduced if other X_B ties are also directed to the same individual, or if the tie completes a 3-cycle of X_B ties.
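One way to read such estimates is on the odds scale. The sketch below assumes that adding the tie in question changes the relevant statistic by exactly one while all other statistics stay fixed, in which case the conditional odds of the tie are multiplied by exp(θ_Z). The three values are the 2-cycle estimates from Table 5 read as 2.53, 0.61 and −0.67; that reading of the signs and decimal points, and the variable names, are assumptions of this illustration.

```python
import math

# 2-cycle estimates from Table 5 (signs and decimal places are an assumed reading).
estimates = {
    "B and B' (reciprocity)": 2.53,
    "N and N' (reciprocity)": 0.61,
    "B and N' (exchange)": -0.67,
}

# Multiplicative change in the conditional odds of a tie when the statistic
# for the configuration increases by one (an assumption of this sketch).
odds_multipliers = {name: math.exp(theta) for name, theta in estimates.items()}

for name, multiplier in odds_multipliers.items():
    print(f"{name}: odds multiplied by {multiplier:.2f}")
```

A multiplier above one corresponds to enhanced conditional odds and a multiplier below one to reduced conditional odds, matching the direction of the effects described in the text.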

Table 5. Parameter estimates for model 5b fitted to the grade 7 peer network

Model parameter                      Z          θ_Z     Approximate standard error
1-paths (choice)                     B         −1.81    0.76
                                     N         −2.39    0.65
2-cycles (reciprocity & exchange)    B ∩ B′     2.53    0.37
                                     N ∩ N′     0.61    0.26
                                     B ∩ N′    −0.67    0.28
2-paths                              BB         0.01    0.05
                                     BN        −0.03    0.04
                                     NB        −0.11    0.04
                                     NN         0.02    0.04
3-cycles                             BB ∩ B′   −0.72    0.14
                                     BN ∩ B′    0.05    0.08
                                     BN ∩ N′    0.03    0.07
                                     NN ∩ N′   −0.05    0.09
2-stars                              BB′       −0.36    0.08
                                     BN′       −0.08    0.04
                                     NN′        0.06    0.04
                                     B′B       −0.01    0.04
                                     B′N       −0.04    0.03
                                     N′N        0.07    0.02
Additional hypothesized constraints  BB ∩ B     0.57    0.06
                                     BN ∩ N     0.17    0.05
                                     NB ∩ N     0.33    0.05
                                     NN ∩ B    −0.09    0.06

4.2. Padgett & Ansell's Florentine network

Our second example is an analysis of marriage and business ties among groups of Florentine families (Padgett & Ansell, 1993). In an analysis of the rise to power of the Medici family in Florence in the early fifteenth century, Padgett & Ansell constructed a number of network relations among 33 groups of elite families, including marriage and business or economic ties. The construction was based on a coding of various types of network relations among a 92-family ruling elite from Kent's (1978) description of the network foundations of the Medici party and their opponents. Padgett & Ansell used marriage and economic networks to derive a clustering of the 92 families into 33 family groups (using the CONCOR algorithm; see Breiger, Boorman & Arabie, 1975); they then coded a relation of a particular type between two family groups if there were at least two pairs of families, with one family from each group, linked by a relation of that type. The analysis presented below is for marriage and economic relations among these 33 family groups, shown in Figure 2a of Padgett & Ansell (1993); for the purpose of the analysis reported below, within-group relationships are ignored and the various types of economic ties are aggregated into a single business/economic relation. Thus, a marriage tie is coded from one group to another if a woman of the first group is married to a man in the second; a business/economic tie signifies the presence of trading or partnership relationships, the sharing or renting of real estate, or a bank employment relation (see Padgett & Ansell, 1993, pp. 1265–1266).

Padgett & Ansell used the interconnections among social and demographic factors, these relational ties, and actions on the part of Cosimo de' Medici to explain the source of the latter's extraordinary power; here, we examine the joint network structure of the marriage and business/economic ties.

We label the relations studied by Padgett & Ansell as X_B (business ties) and X_M (marriage ties). Their associated matrices are B and M, respectively.

In Table 6, we report the fit of six classes of models, similar in construction to those reported for the grade 7 peer network. As for the grade 7 peer network, models 1a and 1b are two- and one-parameter complete independence models, respectively, and model 2 is a multiplexity model. It is clear from Table 6 that there is little improvement in fit of the two-parameter choice complete independence model (model 1a) over the one-parameter choice model (model 1b) (ΔG²_PL = 0.7 with one extra parameter); in addition, permitting dependencies among marriage and business ties for the same individuals does little to improve model fit (ΔG²_PL = 0.4 for model 2 compared to model 1a). Models 3a and 3b are reciprocity and exchange models. Model 3a adds to model 1a the reciprocity effects for X_B and X_M ties; model 3b further adds the exchange effect that allows conditional dependence of a marriage tie from i to j and a business tie from j to i. The reciprocity effects in model 3a lead to a substantial improvement in fit over model 1a (ΔG²_PL = 164.0 with two additional parameters), but no further improvement is achieved by permitting the dyadic exchange of marriage and business ties (ΔG²_PL = 0.2). Model 4 is a path-dependent model and is a marginal improvement in fit over model 3b (ΔG²_PL = 45.1 with six additional parameters). Parameters corresponding to cycles with two or more business ties were excluded from the model because of the infrequency of occurrence of such structures.

Table 6. Summary of fit of models 1a–6d to the Florentine network

Model   No. of parameters   G²_PL   Mean absolute residual
1a       2                  487.2   0.048
1b       1                  487.9   0.048
2        3                  486.8   0.048
3a       4                  323.2   0.032
3b       5                  323.0   0.032
4       11                  277.9   0.029
5a      18                  243.7   0.026
5b      17                  246.3   0.026
6a      21                  227.9   0.026
6b      23                  226.7   0.026
6c      23                  225.2   0.026
6d      23                  217.0   0.025

Since, as Padgett & Ansell (1993) note, the gaining of hierarchical status was the primary consideration in the arrangement of marriage ties between elite families, we might expect marriage ties to exhibit a tendency towards transitivity. Hence, model 5a assumes, in addition to conditional dependencies for paths of length 2, pairwise conditional dependencies among marriage ties from i to j, j to k and i to k (and hence adds a parameter corresponding to the relation X = MM ∩ M). Further, all possible stars comprising two relations are added as well, in order to investigate possible interdependencies between marriage and business ties that are not evident at the level of ties from an actor i to an actor j (see the comparison between the complete independence model 1a and the multiplex model 2). These dependencies also require various star parameters θ_Z, for Z equal to MM′, M′M, M′B and BB′.

The fit of model 5a was a modest improvement over that of model 4 (ΔG²_PL = 34.2 with six additional parameters). The estimated parameter corresponding to the relation MM ∩ M is not large, so in model 5b the parameter is removed, with little effect on the fit of the model (ΔG²_PL = 2.6).

A final set of models fitted to the data investigated the possibility of structural differences in ties according to party affiliation. As Padgett & Ansell (1993) observed, the first 10 family groups are substantially identified with the Medici party (the Medici family themselves comprising group 1), whereas the remaining groups of families are not. Padgett & Ansell described the remarkable structural differences between the network of relations within the Medici party and within the remaining (largely oligarchic) set. Models 6a–6d therefore allow various model 5b parameters to differ according to whether they refer to ties lying within the collection of Medici blocks, to ties connecting non-Medici blocks, or to ties crossing the boundary between the two collections of blocks. Model 6a allows such variation for the density parameter and is a substantial improvement over model 5b (ΔG²_PL = 18.4 with four additional parameters). Model 6b permits the parameters for 'mixed' out-stars comprising marriage and business ties to differ for the three types of blocks, and is not a substantial improvement over model 6a (ΔG²_PL = 1.4). Model 6c allows heterogeneity across blocks in the parameters for 2-paths comprising marriage and business ties; it also fails to improve fit compared to model 6a (ΔG²_PL = 2.5). The final model, 6d, permits heterogeneity across blocks in the parameters for paths comprising two marriage ties; in this case, there is a modest improvement in fit compared to model 6a (ΔG²_PL = 10.8 with two additional parameters).

The estimated parameters for model 6d are shown in Table 7. The estimates suggest a strong tendency for reciprocated business ties, a tendency that is unsurprising given the form of business or economic ties such as partnerships. There are weaker tendencies for the existence of 2-paths comprising either marriage or business ties; marriage ties also appear to be more likely if they complete a cycle of three marriage ties. Padgett & Ansell (1993) noted the presence of these cycles and analysed both their development and their consequences; they make a compelling argument for their importance to the evolving structure of the oligarchy. It can also be seen from Table 7 that path structures in which an outgoing marriage tie is accompanied by an incoming business tie reduce the likelihood of the overall structure. Estimates of star parameters suggest the prevalence of heterogeneous stars in which a group of families have marriage ties with one group and business ties with another. The parameter estimates for homogeneous marriage in-stars and out-stars are both negative: there appears to have been a reduced conditional probability of a marriage tie to a family group if some other group also had such a tie and, to a lesser extent, if the first family group had another outgoing marriage tie.

Table 7. Parameter estimates for model 6d fitted to the Florentine network

Model parameter                        Z          θ_Z     Approximate standard error
1-paths (choice)                       M         −5.17    1.02
                                       B         −7.37    1.25
2-cycles (reciprocity and exchange)    M ∩ M′     0.95    0.94
                                       B ∩ B′    10.33    1.72
                                       M ∩ B′     0.65    1.08
2-paths                                MM         0.66    0.32
                                       MB         0.16    0.38
                                       BM        −0.84    0.37
                                       BB         1.26    0.95
3-cycles                               MM ∩ M′    2.12    0.61
                                       MB ∩ M′   −0.35    0.85
2-stars                                MM′       −1.55    0.37
                                       M′M       −0.43    0.20
                                       BB′       −1.53    1.08
                                       B′B       −0.85    0.99
                                       MB′       −0.14    0.36
                                       M′B        0.92    0.35
Subgroup-dependent M effects
  1-paths within Medici                           3.71    1.12
  1-paths between subgroups                      −4.67    1.92
  1-paths within other subgroups                  0.96
Subgroup-dependent B effects
  1-paths within Medici                           0.70    1.06
  1-paths between subgroups                      −0.80    0.87
  1-paths within other subgroups                  0.10
Subgroup-dependent MM effects
  2-paths within Medici                          −1.33    0.46
  2-paths between subgroups                       1.08    0.44
  2-paths within other subgroups                  0.25

The parameters for block-dependent densities suggest an enhanced likelihood of marriage ties within the Medici collection of family groups and, to a lesser extent, within the non-Medici collection; marriage ties between the two types of family groups were less likely. Business ties exhibit a substantially weaker pattern of the same form. Together, these characteristics of the network reflect what Padgett & Ansell noted was a remarkable interdependence of marriage and economic ties, on the one hand, and political partisanship, on the other, and they support their conclusion that the microstructure of marriage and economics was central to the formation of parties in Florence (1993, p. 1277). The block-dependence of marriage 2-paths takes a different and interesting form: such paths are less likely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection, and they are even more likely to link family groups of different types. The group containing members of the Medici family is the major contributor to this pattern, as they are the only Medici group with marriage connections outside the collection mobilized into the Medici party. Note that this structural effect is fitted at the same time as the cyclic pattern for marriage ties, so that although, as Padgett & Ansell noted, there are many more two-step marriage connections for non-Medici than for Medici partisans, many of the former connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).
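The block-dependent density terms discussed above can be illustrated by splitting a tie count according to where each ordered pair of family groups falls. The sketch assumes a 33 × 33 binary matrix whose first 10 rows and columns are the Medici-aligned groups, as described in the text. The random matrix stands in for the real data, which are not reproduced here, and the function name `block_tie_counts` is hypothetical.

```python
import numpy as np


def block_tie_counts(X, n_medici=10):
    """Split the number of ties by block type for a directed 0/1 matrix X."""
    X = np.asarray(X, dtype=int)
    g = X.shape[0]
    medici = np.arange(g) < n_medici      # first n_medici groups
    off_diag = ~np.eye(g, dtype=bool)     # ignore within-group (diagonal) cells
    within_medici = np.outer(medici, medici) & off_diag
    within_other = np.outer(~medici, ~medici) & off_diag
    between = ~(within_medici | within_other) & off_diag
    return {
        "within_medici": int(X[within_medici].sum()),
        "between": int(X[between].sum()),
        "within_other": int(X[within_other].sum()),
    }


# Placeholder data only: a sparse random directed network on 33 groups.
rng = np.random.default_rng(0)
M = (rng.random((33, 33)) < 0.05).astype(int)
np.fill_diagonal(M, 0)
counts = block_tie_counts(M)
print(counts)
```

Allowing a separate parameter for each of these three counts, rather than one parameter for their sum, is the kind of heterogeneity that models 6a–6d introduce for selected effects.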

Thus, model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell's detailed account.

5. Conclusion

The multivariate p* model is very general in form and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996), and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number r of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research: much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus, a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus, the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form z(x) of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure, which can be tested statistically.

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different 'parts' of the network (and in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that, in the literature on algebraic models for multivariate networks, there is a second tradition that can usefully guide such statistical developments. Local structural descriptions, based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995), describe heterogeneity across individuals. Thus, a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown in general how Markov chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).
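As a rough illustration of the simulation step in the bootstrap approach mentioned above, the sketch below runs a single-tie-toggle Metropolis sampler for a univariate model with only a choice (density) statistic and a reciprocity statistic. This is a deliberately small stand-in: the model, the parameter names `theta_choice` and `theta_mutual`, the starting graph and the chain length are assumptions made for illustration, not the specification used by Preisler (1993) or in this paper.

```python
import numpy as np


def simulate_network(g, theta_choice, theta_mutual, n_steps, seed=0):
    """Draw one directed graph by toggling single ties with Metropolis acceptance.

    The assumed model has log-probability proportional to
    theta_choice * (number of ties) + theta_mutual * (number of mutual dyads).
    """
    rng = np.random.default_rng(seed)
    X = np.zeros((g, g), dtype=int)  # start from the empty graph
    for _ in range(n_steps):
        i, j = rng.choice(g, size=2, replace=False)  # ordered pair with i != j
        # Change in the linear predictor when X[i, j] is 1 rather than 0.
        delta = theta_choice + theta_mutual * X[j, i]
        # Adding the tie changes the log-probability by +delta, removing it by -delta.
        log_ratio = delta if X[i, j] == 0 else -delta
        if np.log(rng.random()) < log_ratio:
            X[i, j] = 1 - X[i, j]
    return X


X = simulate_network(g=10, theta_choice=-1.0, theta_mutual=2.0, n_steps=5000)
print("density:", X.sum() / (10 * 9))
```

In a parametric bootstrap, many such graphs would be drawn at the fitted parameter values, the model refitted to each, and the spread of the refitted estimates used as an approximate standard error. A practical sampler would also need burn-in and convergence checks, which are omitted here.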

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754) and the National Institute of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance, and Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.

Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.

Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.

Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Association, 47, 77–92.

Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.

Boorman, S. A. & White, H. C. (1976). Social structure from multiple networks: II. Role structures. American Journal of Sociology, 81, 1384–1446.

Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.

Breiger, R. L., Boorman, S. A. & Arabie, P. (1975). An algorithm for clustering relational data, with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.

Coleman, J. S., Katz, E. & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N. & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R. & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B. & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure II: Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E. & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds), Operations research proceedings 1986, pp. 455–460. Berlin/Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathématiques, Informatique et Sciences Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B. & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O. & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds), Quo vadis, graph theory? Annals of Discrete Mathematics, 55, 349–366.

Frank, O. & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J. & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J. & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W. & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W. & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J. & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D. & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.


Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L. & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E. & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A. & Pattison, P. (1998). Multivariate p* models of producers' market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P. & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F. & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P. & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G. & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). p1/2. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D. & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M. & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.


Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S. & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S. & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S. & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 60, 401–425.

Wasserman, S. & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S. & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A. & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C. & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999


Page 10: Logit models and logistic regressions for social …...Fienberg, Meyer & Wasserman (1985), Wasserman (1987), Iacobucci & Wasserman (1987) and Iacobucci (1989) extended thep1family

3.2.2. The model

In combination with various homogeneity constraints, model (1) can be written in the general form

$$P(X = x) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)}, \tag{2}$$

where $\theta$ is a vector of model parameters and $z(x)$ is a vector of network statistics. As we have described, these vectors depend on the structure of the hypothesized dependence graph and on whether any homogeneity constraints have been proposed.

The model is of exponential family form; that is, the probability function depends on an exponential function of a linear combination of network statistics. In some cases, constraints on the elements of $\theta$ are required in order to ensure a set of uniquely determined parameters (as we illustrate later with our examples). Usually the elements of $\theta$ are unknown and must be estimated.

The function $\kappa(\theta)$ in the denominator of model (2) is a normalizing quantity whose value guarantees that the probability distribution is indeed proper, summing to unity over the sample space of the random variable $X$ (the set of all possible multivariate networks with $r$ relations and $g$ actors).
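To make the size of this sample space concrete, the following minimal sketch evaluates the normalizing quantity by brute-force enumeration for a toy case. It assumes a choice-only model (one tie-count statistic per relation) on $g = 3$ actors and $r = 2$ relations; the function names `kappa` and `choice_stats` and the parameter values are illustrative assumptions, not part of the original analysis.

```python
import itertools
import math

def choice_stats(x):
    """z(x) for a choice-only model: the number of ties in each relation."""
    return [sum(sum(row) for row in relation) for relation in x]

def kappa(theta, stats, g, r):
    """Brute-force normalizing quantity: sum of exp(theta'z(x)) over all multigraphs."""
    cells = [(m, i, j) for m in range(r) for i in range(g) for j in range(g) if i != j]
    total = 0.0
    for bits in itertools.product((0, 1), repeat=len(cells)):
        x = [[[0] * g for _ in range(g)] for _ in range(r)]
        for (m, i, j), bit in zip(cells, bits):
            x[m][i][j] = bit
        total += math.exp(sum(t * s for t, s in zip(theta, stats(x))))
    return total

g, r = 3, 2
sample_space = 2 ** (r * g * (g - 1))   # number of multigraphs enumerated for the toy case
k = kappa([0.5, -1.0], choice_stats, g, r)

# Number of possible ties for a network with g = 29 actors and r = 2 relations;
# the sample space then has 2**grade7_cells elements, far too many to enumerate.
grade7_cells = 2 * 29 * 28
```

For the choice-only model the ties are independent, so the enumerated value can be checked against the closed form $\prod_m (1 + e^{\theta_m})^{g(g-1)}$; no such shortcut exists once dependence among ties is allowed.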

Estimation of the parameters of models that assume only multiplexity and/or generalized reciprocity and exchange effects (as in the multivariate $p_1$ model) is not particularly difficult. In these cases, the likelihood function is simply the product of the probabilities for each multivariate tie or dyad (for example, see Wasserman, 1987). Estimation of parameters of the general multivariate p* model is not straightforward, however. The likelihood function for the parameters $\theta$ of p* depends on the complicated normalizing quantity $\kappa(\theta)$, which makes maximum likelihood estimation difficult except in special circumstances (such as dyadic independence) and when the multigraphs are quite small (Walker, 1995). In order for probabilities to be computed, one must be able to calculate $\kappa$, which is just too difficult for most networks. Hence, alternative model formulations and approximate estimation techniques are important. One such alternative, which we now describe, utilizes log-odds ratios of the conditional probabilities of each element of $X$.

3.2.3. The logit model

We can turn model (2) into a generalized autologistic model for conditional probabilities, giving us an equivalence between model (2) and spatial models (Besag, 1972, 1974; Strauss, 1992). The step utilizes the dichotomous nature of the random variable $X_{ijm}$ and produces an approximate likelihood function that is much easier to deal with.

We first condition on the complement of $X_{ijm}$ and consider just the probability that the dichotomous random variable $X_{ijm}$ is unity. Recall that this variable records whether the tie from $i$ to $j$ of type $m$ is present. Specifically, consider

$$P(X_{ijm} = 1 \mid X^C_{ijm}) = \frac{P(X = x^+_{ijm})}{P(X = x^+_{ijm}) + P(X = x^-_{ijm})} = \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^+_{ijm})\} + \exp\{\theta' z(x^-_{ijm})\}}, \tag{3}$$

which has the advantage of not depending on the normalizing quantity. We next consider the odds ratio, which simplifies model (3):

$$\frac{P(X_{ijm} = 1 \mid X^C_{ijm})}{P(X_{ijm} = 0 \mid X^C_{ijm})} = \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^-_{ijm})\}} = \exp\{\theta' [z(x^+_{ijm}) - z(x^-_{ijm})]\}. \tag{4}$$

From this, the log-odds ratio, or logit model, has the rather simple expression

$$\omega_{ijm} = \log\left\{\frac{P(X_{ijm} = 1 \mid X^C_{ijm})}{P(X_{ijm} = 0 \mid X^C_{ijm})}\right\} = \theta' [z(x^+_{ijm}) - z(x^-_{ijm})]. \tag{5}$$

If we define $\delta(x_{ijm}) = [z(x^+_{ijm}) - z(x^-_{ijm})]$, then the logit model (5) simplifies succinctly to $\omega_{ijm} = \theta' \delta(x_{ijm})$. The expression $\delta(x_{ijm})$ is the vector of changes in the network statistics that arise when the quantity $x_{ijm}$ changes from 1 to 0. This version of the model is a logit p* model for a multivariate network, and is a generalized autologistic model (see Strauss, 1992) applied to social network data.

3.3. Estimation

The likelihood function for the general form of the multivariate p* model (2) is

$$L(\theta) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)},$$

where the dependence on the normalizing quantity can easily be seen. As mentioned, maximum likelihood estimation of $\theta$ is difficult due to the size of the sample space.

An approximate estimation approach, proposed by Besag (1975, 1977b) and adopted by Strauss (1986), Strauss & Ikeda (1990) and Wasserman & Pattison (1996), utilizes tools made popular in models for rectangular lattices and spatial data; specifically, we use the logit formulation and define the pseudo-likelihood function as

$$PL(\theta) = \prod_{i \neq j} \prod_{m=1}^{r} P(X_{ijm} = 1 \mid X^C_{ijm})^{x_{ijm}} \, P(X_{ijm} = 0 \mid X^C_{ijm})^{1 - x_{ijm}}, \tag{6}$$

and a maximum pseudo-likelihood estimator (MPLE) to be the value of $\theta$ that maximizes (6). MPLEs are much easier to calculate than maximum likelihood estimators (MLEs). MPLEs differ from MLEs for all but the simplest models (those for which the conditional probabilities are indeed independent of the complement relation). Basically, the approach assumes conditional independence of the random variables representing the multivariate relational ties (for discussion of the issues in using maximum pseudo-likelihood rather than maximum likelihood estimation, see Wasserman & Pattison, 1996, and Preisler, 1993).
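As a sketch of how (6) might be evaluated in practice, the function below computes the log of the pseudo-likelihood from a list of pairs $(x_{ijm}, \delta(x_{ijm}))$, one pair for each ordered pair $i \neq j$ and each relation $m$. The data layout and the name `log_pseudo_likelihood` are assumptions made for illustration.

```python
import math

def log_pseudo_likelihood(theta, rows):
    """Log of equation (6).

    rows holds one (x_ijm, delta_vector) pair per possible tie. With
    eta = theta'delta, log P(X=1 | rest) = eta - log(1 + e^eta) and
    log P(X=0 | rest) = -log(1 + e^eta).
    """
    total = 0.0
    for y, d in rows:
        eta = sum(t * v for t, v in zip(theta, d))
        total += y * eta - math.log1p(math.exp(eta))
    return total

# Hypothetical toy data: four possible ties, two explanatory variables.
rows = [(1, [1, 0]), (0, [1, 1]), (1, [1, 1]), (0, [1, 0])]
value_at_zero = log_pseudo_likelihood([0.0, 0.0], rows)
```

At $\theta = 0$ every conditional probability is one half, so the log pseudo-likelihood equals $-4 \log 2$ for these four rows. The direct use of `math.exp` would overflow for very large linear predictors; a production routine would guard against that.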

There is a large literature on the use of approximate likelihoods in spatial modelling. Diggle (1996) reviews models for discrete spatial variation and notes that there are several possible estimation techniques. He notes in his detailed discussion that MPLEs are more efficient than other possibilities (which include the coding method of Besag, 1974). Further, for moderately large samples, the differences between MPLEs and MLEs are often negligible. Small sample sizes, and hence small networks ($g < 10$), unfortunately are particularly problematic.


In social network modelling, Strauss & Ikeda (1990) established that estimation of $\theta$ for single dichotomous relations can be accomplished via logistic regression, using any standard logistic regression model-fitting routine. In particular, they showed that maximizing the pseudo-likelihood given in equation (6) is equivalent to maximizing the likelihood function for the fit of logistic regression to model (5) (for independent observations $x_{ijm}$). Further, they observed that such logistic regressions can be fitted using iteratively reweighted Gauss–Newton computational techniques, as implemented by any logistic regression model package.

The proof of this result uses the fact that the derivatives of the pseudo-likelihood, set equal to zero, are identical to those obtained from a logistic regression with the relational variables as data values. Thus, fitting p* can be done by using the logit p* form and assuming that the relational variables are actually statistically independent. The idea for this theorem was first suggested by Frank & Strauss (1986) for estimation of the parameters in their triad model. The generalization of this result to the three-way binary array $X$ is straightforward.
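The equivalence with logistic regression can be sketched with a small Newton–Raphson routine. This is a minimal illustration in plain Python, not the routine used for the analyses reported here; in practice any standard logistic regression package would serve. The regression has no separate intercept because the choice statistic, whose change value is 1 for every tie, plays that role. The helper names and the toy data are assumptions.

```python
import math

def solve(a, b):
    """Solve the linear system a s = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    s = [0.0] * n
    for r in reversed(range(n)):
        s[r] = (m[r][n] - sum(m[r][c] * s[c] for c in range(r + 1, n))) / m[r][r]
    return s

def fit_mple(rows, n_iter=25):
    """Maximize the pseudo-likelihood by Newton-Raphson on the logistic score equations.

    rows holds one (x_ijm, delta_vector) pair per possible tie.
    """
    p = len(rows[0][1])
    theta = [0.0] * p
    for _ in range(n_iter):
        grad = [0.0] * p
        hess = [[0.0] * p for _ in range(p)]
        for y, d in rows:
            eta = sum(t * v for t, v in zip(theta, d))
            mu = 1.0 / (1.0 + math.exp(-eta))      # fitted conditional probability
            w = mu * (1.0 - mu)
            for a in range(p):
                grad[a] += (y - mu) * d[a]
                for b in range(p):
                    hess[a][b] += w * d[a] * d[b]
        step = solve(hess, grad)
        theta = [t + s for t, s in zip(theta, step)]
    return theta

# Hypothetical data: a choice term (always 1) and an indicator that the reverse tie is present.
rows = ([(1, [1.0, 0.0])] * 2 + [(0, [1.0, 0.0])] * 6
        + [(1, [1.0, 1.0])] * 3 + [(0, [1.0, 1.0])] * 1)
theta_hat = fit_mple(rows)
```

With two binary-coded explanatory variables the fit is saturated, so the estimates reproduce the observed log-odds in each group: $\log(2/6)$ for ties without a reverse tie and $\log(3/1)$ for ties with one. The sketch does not handle a singular Hessian or separation, which real networks with sparse configurations can produce.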

The evaluation of the fit of multivariate p* is not straightforward, but it is helpful to compare the observed values $x_{ijm}$ with the fitted values $\hat{x}_{ijm}$. The fitted values, as is common with dichotomous variables, are defined as $\hat{x}_{ijm} = \hat{P}(X_{ijm} = 1 \mid X^C_{ijm})$. The estimated conditional probabilities are computed from

$$\operatorname{logit} \hat{P}(X_{ijm} = 1 \mid X^C_{ijm}) = \hat{\theta}' \delta(x_{ijm}).$$

Two useful indices of fit are the pseudo-likelihood ratio statistic

$$G^2_{PL} = 2 \sum x_{ijm} \log(x_{ijm} / \hat{x}_{ijm})$$

for a model, and the mean of the absolute value of the residuals $(x_{ijm} - \hat{x}_{ijm})$. In the examples below, we report both $G^2_{PL}$ and the mean absolute residual. Unfortunately, as with all other uses of this MPLE approach, the distribution of $G^2_{PL}$ is unknown, even asymptotically, and there is no straightforward way of estimating the standard errors of parameter estimates (although asymptotic standard errors calculated from logistic regression models can give approximate guidance to the modeller). Crouch & Wasserman (1998) give some preliminary results comparing MPLEs to MLEs and report the optimistic finding that, for moderately large networks ($g > 10$), both standard errors and test statistics based on the pseudo-likelihood approach are quite close to those based on the exact likelihood.
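The two fit indices can be sketched as follows. The code assumes that the sum in $G^2_{PL}$ runs over both outcomes (tie present and tie absent) of every possible tie, which makes it the usual binomial deviance, $-2 \log PL(\hat{\theta})$; that reading, the function name `fit_indices` and the toy data are assumptions.

```python
import math

def fit_indices(theta, rows):
    """Return (G2_PL, mean absolute residual) for (x_ijm, delta_vector) rows."""
    g2 = 0.0
    abs_residuals = 0.0
    for y, d in rows:
        eta = sum(t * v for t, v in zip(theta, d))
        fitted = 1.0 / (1.0 + math.exp(-eta))
        # Binomial deviance contribution of one possible tie.
        g2 += -2.0 * (y * math.log(fitted) + (1 - y) * math.log(1.0 - fitted))
        abs_residuals += abs(y - fitted)
    return g2, abs_residuals / len(rows)

# Hypothetical choice-only fit: 3 ties present out of 12 possible ties.
rows = [(1, [1.0])] * 3 + [(0, [1.0])] * 9
g2, mean_abs_resid = fit_indices([math.log(3 / 9)], rows)
```

For a choice-only model with fitted density $d$, the mean absolute residual reduces to $2d(1 - d)$, which is 0.375 for the toy data above.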

3.4. Computational details

Maximum pseudo-likelihood estimates of the parameters of model (1) are obtained by fitting the logistic regression model (5). In order to fit model (5), we compute, for each relational tie, the values of the 'explanatory variables' $z(x^+_{ijm}) - z(x^-_{ijm})$ corresponding to each statistic $z(x)$; we then use these as the observed explanatory variables for the realization of $X_{ijm}$ (the 'response variable') in the logistic regression corresponding to model (5).
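For dyadic effects, the explanatory variables have simple closed forms. The sketch below builds the response and explanatory variables for a two-relation model with choice, reciprocity and exchange effects. It assumes that reciprocity is counted once per mutual dyad and exchange once per ordered pair, and the relation labels `xb` and `xn` and the toy matrices are illustrative.

```python
def design_rows(xb, xn):
    """One (response, explanatory vector) pair per possible tie.

    Explanatory variables, in order: choice B, choice N, reciprocity B,
    reciprocity N, exchange (a B tie one way and an N tie the other way).
    """
    g = len(xb)
    rows = []
    for i in range(g):
        for j in range(g):
            if i == j:
                continue
            # Tie i -> j of type B: changes counts involving x_B(i, j).
            rows.append((xb[i][j], [1, 0, xb[j][i], 0, xn[j][i]]))
            # Tie i -> j of type N: changes counts involving x_N(i, j).
            rows.append((xn[i][j], [0, 1, 0, xn[j][i], xb[j][i]]))
    return rows

# Hypothetical three-actor network.
xb = [[0, 1, 0], [1, 0, 0], [0, 0, 0]]
xn = [[0, 0, 1], [0, 0, 0], [0, 1, 0]]
rows = design_rows(xb, xn)
```

The resulting rows can be passed to any logistic regression routine without an intercept, since the two choice columns act as relation-specific intercepts.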

The computation of the values $z(x^+_{ijm}) - z(x^-_{ijm})$ is simple, but it is useful to note that the values may take a different form for the various types of relational ties (corresponding to the subscript $m$ of $X_{ijm}$). For example, suppose that there are two relations, $X_l$ and $X_h$ respectively, and consider the parameter corresponding to the triadic effect $Z = (X_l X_h) \cap X_h$. If we assume homogeneity, then the sufficient statistic for this parameter is $f_Z$. For the two relations, the computed values of the explanatory variable for this triadic effect are equal to the changes in the statistic $f_Z$ when $x_{ijm}$ changes from 1 to 0, for $m = l$ or $h$. Thus, when $m = l$ (corresponding to the values for the first relation $X_l$), we compute

$$\sum_k x_{ikh} x_{jkh}$$

as the value of the explanatory variable corresponding to this parameter, and when $m = h$ (corresponding to an $X_h$ tie), we compute

$$\sum_k x_{ikl} x_{kjh} + \sum_k x_{kil} x_{kjh}.$$
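These two expressions can be checked against a direct recount of the statistic. The sketch assumes that $f_Z$ counts triples $(a, b, c)$ with an $X_l$ tie from $a$ to $b$, an $X_h$ tie from $b$ to $c$ and an $X_h$ tie from $a$ to $c$, that $l \neq h$, and that there are no self-ties; the random test networks are hypothetical.

```python
import itertools
import random

def f_z(xl, xh):
    """Number of triples with a->b in X_l, b->c in X_h and a->c in X_h."""
    g = len(xl)
    return sum(xl[a][b] * xh[b][c] * xh[a][c]
               for a, b, c in itertools.product(range(g), repeat=3))

def change_l(xl, xh, i, j):
    """Closed-form change in f_Z for the tie i -> j of type l."""
    return sum(xh[i][k] * xh[j][k] for k in range(len(xl)))

def change_h(xl, xh, i, j):
    """Closed-form change in f_Z for the tie i -> j of type h."""
    g = len(xl)
    return (sum(xl[i][k] * xh[k][j] for k in range(g))
            + sum(xl[k][i] * xh[k][j] for k in range(g)))

def brute_change(xl, xh, i, j, which):
    """f_Z with the tie present minus f_Z with the tie absent."""
    def count(value):
        al = [row[:] for row in xl]
        ah = [row[:] for row in xh]
        (al if which == "l" else ah)[i][j] = value
        return f_z(al, ah)
    return count(1) - count(0)

def random_relation(g, rng):
    return [[rng.randint(0, 1) if i != j else 0 for j in range(g)] for i in range(g)]

rng = random.Random(0)
g = 5
xl, xh = random_relation(g, rng), random_relation(g, rng)
agree = all(
    change_l(xl, xh, i, j) == brute_change(xl, xh, i, j, "l")
    and change_h(xl, xh, i, j) == brute_change(xl, xh, i, j, "h")
    for i in range(g) for j in range(g) if i != j
)
```

The closed forms avoid recomputing the full statistic for every tie, which matters because the recount is cubic in the number of actors.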

4. Examples

We illustrate the construction and fitting of multivariate p* models using two examples.

4.1. The Grade 7 peer network

The first example is an extension of the data analysed by Wasserman & Pattison (1996). Vickers (1981) and Vickers & Chan (1981) obtained network data from 29 students in grade 7 in a school in Victoria, Australia. They asked students to nominate their classmates on a number of relations, including the following:

1) Who are your best friends in the class?
2) Who would you rather not have as a friend?

We label the relations defined by these two questions as $X_B$ (relation 1) and $X_N$ (relation 2), and their associated matrices as $B$ and $N$, respectively. The matrix for the 'best friends' relation is given here as our Table 2, and the matrix for the 'not friends' relation as our Table 3. As noted by Wasserman & Pattison (1996), actors 1–12 are boys, while actors 13–29 are girls.

In Wasserman & Pattison (1996), we analysed the relation $X_B$ and established that it possessed strong reciprocity and transitivity effects. Here, we fit models simultaneously to the relations $X_B$ and $X_N$ in an attempt to model their mutual interdependence. Our models use the methodology described earlier and are guided by the literature that has speculated on the structure of positive and negative affect ties (see the discussion in Wasserman & Faust, 1994, Chapter 6, on signed graphs); we also compare our models to previous descriptive analyses of similar types of ties. We report the fit of a number of homogeneous models.

Models 1a and 1b – independence. We first fit two versions of a complete independence model in which we make the (implausible) assumption that all observed ties are independent. In the first version of the model, we allow a single, separate 'choice' parameter $\theta_Z$ (where $Z$ may be either $B$ or $N$) for each type of relation; in the second, more restricted version, we assume a single, common choice parameter. In both versions of the model, the maximal cliques of the dependence graph have the form $\{(i, j, m)\}$; in model 1a, the parameters corresponding to this clique are assumed to depend on relation $m$ (but not on actor $i$ or $j$), whereas in model 1b, the parameter is assumed constant for all $i$, $j$ and $m$. The sufficient statistics for model 1a are $f_B$ and $f_N$; model 1b has sufficient statistic $f_{B+N}$. The fit of models 1a and 1b is summarized in Table 4. Neither model provides a good fit, with the mean of the absolute residuals equal to approximately 0.37. Since model 1b is nested in model 1a, the difference between the pseudo-likelihood ratio statistics is of interest, and we note that model 1b appears to be no worse a fit than model 1a ($\Delta G^2_{PL} = 3.2$ and the models differ by one parameter).


Model 2 – multiplexity. Model 2 is a multiplexity model with maximal cliques $\{(i, j, 1), (i, j, 2)\}$. The model allows for the possibility that an $X_B$ tie from $i$ to $j$ is conditionally dependent on an $X_N$ tie from $i$ to $j$. The parameters of the model have the form $\theta_Z$, where $Z$ may be $B$, $N$ or $B \cap N$; the corresponding sufficient statistics are $f_B$, $f_N$ and $f_{B \cap N}$, respectively. Thus, this model adds a single multiplex parameter $\theta_{B \cap N}$ to the two choice parameters in model 1a. Model 2 appears to be a substantial improvement over model 1a ($\Delta G^2_{PL} = 253.7$ with one additional parameter), but the small frequency of $B \cap N$ ties implies that the MPLE of its corresponding parameter is likely to have a large standard error.

Models 3a and 3b – reciprocity and exchange. Model 3 assumes bivariate dyad independence (as described by Wasserman, 1987) and has maximal cliques $\{(i, j, 1), (i, j, 2), (j, i, 1), (j, i, 2)\}$. We fit two restricted versions of the model: first, model 3a, in which only choice and reciprocity effects are assumed (with parameters $\theta_Z$ for $Z = B$, $N$, $B \cap B'$ and $N \cap N'$); and second, model 3b, with an additional exchange parameter $\theta_Z$ for the relation $Z = B \cap N'$. In model 3a, the presence of an $X_B$ tie from $i$ to $j$ is assumed to be conditionally dependent on the presence of an $X_B$ tie from $j$ to $i$ (that is, on the presence of reciprocity); similarly for $X_N$ ties. Model 3b allows, in addition, the presence of an $X_B$ tie from $i$ to $j$ to be conditionally dependent on the presence of an $X_N$ tie from $j$ to $i$ (that is, on the exchange of an $X_N$ tie for an $X_B$ one). We have not fitted the most general homogeneous dyad-independence model, which includes multiplexity parameters, since $B$ and $N$ co-occur only rarely (and, as a result, it is difficult to fit parameters corresponding to relations such as $B \cap N$, $B \cap N \cap B'$ and so forth). The fit statistics in Table 4 indicate that not only is model 3a a substantial improvement over model 1a ($\Delta G^2_{PL} = 208.6$ with just two additional parameters), but also that model 3b provides a marginally better fit than model 3a ($\Delta G^2_{PL} = 27.1$ with one additional parameter). These figures suggest the presence of both reciprocity and exchange effects. Note, though, that the fit of model 3b is still not particularly good, with the mean of the absolute residuals equal to 0.311.

Table 2. Vickers & Chan's (1981) network data: 'best friends' relation

0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 1 0 1 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 1 0 1 0 1 1 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 0 1 0
1 1 1 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 0 0 1 1 1
1 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 0 0 1 1 1 1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 1 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 1 0 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 1 0 0 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0

Table 3. Vickers & Chan's (1981) network data: 'not friends' relation

0 0 0 0 0 0 1 0 1 1 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 0 0 1 1
0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 1 1 0 0 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0
1 0 1 1 0 0 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 1
0 0 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
0 0 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0
1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 1
1 0 0 1 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0

Table 4. Summary of fit of models 1a–5b to the grade 7 peer network

Model   No. of parameters   $G^2_{PL}$   Mean absolute residual
1a      2                   1794.1       0.366
1b      1                   1797.3       0.367
2       3                   1540.4       0.329
3a      4                   1585.5       0.315
3b      5                   1558.4       0.311
4       13                  1511.0       0.300
5a      19                  1220.6       0.241
5b      23                  1032.3       0.196
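The nested-model comparisons quoted in the text for the grade 7 network can be re-derived from Table 4 as a simple check. The sketch reads the Table 4 entries with their decimal points restored (1794.1, 1797.3 and so on), which is an assumption about the original typesetting; the helper name `compare` is illustrative.

```python
# G2_PL and parameter counts from Table 4 (grade 7 peer network).
g2_pl = {"1a": 1794.1, "1b": 1797.3, "2": 1540.4, "3a": 1585.5,
         "3b": 1558.4, "4": 1511.0, "5a": 1220.6, "5b": 1032.3}
n_par = {"1a": 2, "1b": 1, "2": 3, "3a": 4, "3b": 5, "4": 13, "5a": 19, "5b": 23}

def compare(restricted, general):
    """Return (difference in G2_PL, number of additional parameters)."""
    diff = round(g2_pl[restricted] - g2_pl[general], 1)
    return diff, n_par[general] - n_par[restricted]

comparisons = {
    "1b vs 1a": compare("1b", "1a"),
    "1a vs 2": compare("1a", "2"),
    "1a vs 3a": compare("1a", "3a"),
    "3a vs 3b": compare("3a", "3b"),
    "3b vs 4": compare("3b", "4"),
    "4 vs 5a": compare("4", "5a"),
    "5a vs 5b": compare("5a", "5b"),
}
```

Because the distribution of $G^2_{PL}$ is unknown, these differences are descriptive guides rather than formal test statistics.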

Model 4 – path dependence. Model 4 is a path-dependent model and assumes that a tie of any type from $i$ to $j$ may be conditionally dependent on ties of any type from $j$ to some third individual $k$. Maximal cliques therefore have the form $\{(i, j, m), (j, i, h)\}$ or $\{(i, j, m), (j, k, h), (k, i, p)\}$; parameters and sufficient statistics are given by $\theta_Z$ and $f_Z$, respectively, where $Z$ may be any of the relations $B$, $N$, $B \cap B'$, $N \cap N'$, $B \cap N'$, $BB$, $BN$, $NB$, $NN$, $BB \cap B'$, $BN \cap B'$, $BN \cap N'$ and $NN \cap N'$. Compared to model 3b, model 4 adds only marginally to the fit ($\Delta G^2_{PL} = 47.4$ with eight additional parameters).

Models 5a and 5b – restricted Markov random graph models. The final set of models are path-dependent models with additional dependencies assumed on substantive grounds. All models have the model 4 parameters; in addition, model 5a possesses dependencies consistent with the transitivity-like hypothesis that friends are likely to agree on their relations with third parties (hence, likely pairwise conditional dependencies between relational ties of type $X_B$ from $i$ to $j$, $i$ to $k$ and $j$ to $k$, and also between relational ties of type $X_B$ from $i$ to $j$, of type $X_N$ from $i$ to $k$ and of type $X_N$ from $j$ to $k$). Model 5b possesses additional dependencies consistent with the claim that non-friends are likely to disagree on their relations with third parties (hence, likely pairwise conditional dependencies between relational ties of type $X_N$ from $i$ to $j$, of type $X_N$ from $i$ to $k$ and of type $X_B$ from $j$ to $k$, and also between relational ties of type $X_N$ from $i$ to $j$, of type $X_B$ from $i$ to $k$ and of type $X_N$ from $j$ to $k$). (See Johnsen (1986) for a review and analysis of the literature on the structure of affective ties, and Pattison (1993) for an algebraic translation of these structural claims.) Model 5a adds $\{(i, j, 1), (j, k, 1), (i, k, 1)\}$ and $\{(i, j, 1), (j, k, 2), (i, k, 2)\}$ to the set of maximal cliques for model 4; model 5b also adds $\{(i, j, 2), (j, k, 1), (i, k, 2)\}$ and $\{(i, j, 2), (j, k, 2), (i, k, 1)\}$. We note that all of the subcliques of these additional maximal cliques have corresponding parameters in models 5a and 5b; these additional subcliques correspond to various forms of stars: $\{(i, j, m), (i, k, h)\}$, $\{(i, j, m), (k, j, h)\}$ and $\{(i, j, m), (j, k, h)\}$. As indicated in Table 4, the additional dependencies assumed by model 5a lead to a substantial improvement over the simple path-dependent model 4 ($\Delta G^2_{PL} = 290.4$ with six additional parameters), and those associated with model 5b lead to a modest further improvement in fit ($\Delta G^2_{PL} = 188.3$ with four additional parameters). The mean of the absolute residuals for model 5b is 0.196, suggesting a more reasonable fit to the data (but one that could lend itself to further possible improvement).

The MPLEs for the parameters of model 5b are displayed in Table 5. Positive estimates were observed for both reciprocity parameters and for the parameters associated with three of the four additional hypothesized dependencies. Thus, the conditional odds of a tie of any type appear to be enhanced if a reciprocal tie of the same type is present, if the tie completes one of the expected triadic structures for agreement between friends, or if the tie completes a triad in which an individual would rather not have as a friend any friend of someone who has been indicated as a non-friend. Negative estimates were obtained for the exchange parameter, for 2-stars comprising two incoming $X_B$ ties, and for 3-cycles comprising $X_B$ ties. Thus, the conditional odds of a tie of any type appear to be reduced by the presence of a reciprocated tie of the other type; in addition, the odds of an $X_B$ tie being directed to a particular individual are reduced if other $X_B$ ties are also directed to the same individual, or if the tie completes a 3-cycle of $X_B$ ties.

Table 5. Parameter estimates for model 5b fitted to the grade 7 peer network

Model parameter                       Z             $\hat{\theta}_Z$   Approximate standard error
1-paths (choice)                      B             -1.81              0.76
                                      N             -2.39              0.65
2-cycles (reciprocity & exchange)     B ∩ B'         2.53              0.37
                                      N ∩ N'         0.61              0.26
                                      B ∩ N'        -0.67              0.28
2-paths                               BB             0.01              0.05
                                      BN            -0.03              0.04
                                      NB            -0.11              0.04
                                      NN             0.02              0.04
3-cycles                              BB ∩ B'       -0.72              0.14
                                      BN ∩ B'        0.05              0.08
                                      BN ∩ N'        0.03              0.07
                                      NN ∩ N'       -0.05              0.09
2-stars                               BB'           -0.36              0.08
                                      BN'           -0.08              0.04
                                      NN'            0.06              0.04
                                      B'B           -0.01              0.04
                                      B'N           -0.04              0.03
                                      N'N            0.07              0.02
Additional hypothesized constraints   BB ∩ B         0.57              0.06
                                      BN ∩ N         0.17              0.05
                                      NB ∩ N         0.33              0.05
                                      NN ∩ B        -0.09              0.06

4.2. Padgett & Ansell's Florentine network

Our second example is an analysis of marriage and business ties among groups of Florentine families (Padgett & Ansell, 1993). In an analysis of the rise to power of the Medici family in Florence in the early fifteenth century, Padgett & Ansell constructed a number of network relations among 33 groups of elite families, including marriage and business or economic ties. The construction was based on a coding of various types of network relations among a 92-family ruling elite from Kent's (1978) description of the network foundations of the Medici party and their opponents. Padgett & Ansell used marriage and economic networks to derive a clustering of the 92 families into 33 family groups (using the CONCOR algorithm; see Breiger, Boorman & Arabie, 1975); they then coded a relation of a particular type between two family groups if there were at least two pairs of families, with one family from each group, linked by a relation of that type. The analysis presented below is for marriage and economic relations among these 33 family groups, shown in figure 2a of Padgett & Ansell (1993); for the purpose of the analysis reported below, within-group relationships are ignored, and the various types of economic ties are aggregated into a single business/economic relation. Thus, a marriage tie is coded from one group to another if a woman of the first group is married to a man in the second; a business/economic tie signifies the presence of trading or partnership relationships, the sharing or renting of real estate, or a bank employment relation (see Padgett & Ansell, 1993, pp. 1265–1266).
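The aggregation rule described above (a group-level tie requires at least two linked pairs of families, and within-group relationships are ignored) can be sketched as follows. The treatment of direction, the family labels and the group assignment are hypothetical and are not taken from Padgett & Ansell's data.

```python
def group_relation(family_ties, group_of, min_pairs=2):
    """Aggregate directed family-level ties to group-level ties.

    family_ties: set of (family_a, family_b) pairs for one type of relation.
    group_of: mapping from family to its family group.
    A tie between two groups is coded when at least min_pairs distinct
    family pairs, one family from each group, are linked.
    """
    counts = {}
    for a, b in family_ties:
        ga, gb = group_of[a], group_of[b]
        if ga == gb:
            continue                       # within-group relationships are ignored
        counts[(ga, gb)] = counts.get((ga, gb), 0) + 1
    return {pair for pair, n in counts.items() if n >= min_pairs}

# Hypothetical example: five families in three groups.
group_of = {"f1": 1, "f2": 1, "f3": 2, "f4": 2, "f5": 3}
family_ties = {("f1", "f3"), ("f2", "f4"), ("f1", "f5"), ("f1", "f2")}
group_ties = group_relation(family_ties, group_of)
```

Here only the tie from group 1 to group 2 survives, since the single family-level tie from group 1 to group 3 falls below the threshold.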

Padgett & Ansell used the interconnections among social and demographic factors, these relational ties, and actions on the part of Cosimo de' Medici to explain the source of the latter's extraordinary power; here, we examine the joint network structure of the marriage and business/economic ties.

We label the relations studied by Padgett & Ansell as $X_B$ (business ties) and $X_M$ (marriage ties). Their associated matrices are $B$ and $M$, respectively.

In Table 6, we report the fit of six classes of models, similar in construction to those reported for the grade 7 peer network. As for the grade 7 peer network, models 1a and 1b are two- and one-parameter complete independence models, respectively, and model 2 is a multiplexity model. It is clear from Table 6 that there is little improvement in fit of the two-parameter choice complete independence model (model 1a) over the one-parameter choice model (model 1b) ($\Delta G^2_{PL} = 0.7$ with one extra parameter); in addition, permitting dependencies among marriage and business ties for the same individuals does little to improve model fit ($\Delta G^2_{PL} = 0.4$ for model 2 compared to model 1a). Models 3a and 3b are reciprocity and exchange models. Model 3a adds to model 1a the reciprocity effects for $X_B$ and $X_M$ ties; model 3b further adds the exchange effect that allows conditional dependence of a marriage tie from $i$ to $j$ and a business tie from $j$ to $i$. The reciprocity effects in model 3a lead to a substantial improvement in fit over model 1a ($\Delta G^2_{PL} = 164.0$ with two additional parameters), but no further improvement is achieved by permitting the dyadic exchange of marriage and business ties ($\Delta G^2_{PL} = 0.2$). Model 4 is a path-dependent model and is a marginal improvement in fit over model 3b ($\Delta G^2_{PL} = 45.1$ with six additional parameters). Parameters corresponding to cycles with two or more business ties were excluded from the model because of the infrequency of occurrence of such structures.

Since, as Padgett & Ansell (1993) note, the gaining of hierarchical status was the primary consideration in the arrangement of marriage ties between elite families, we might expect marriage ties to exhibit a tendency towards transitivity. Hence, model 5a assumes, in addition to conditional dependencies for paths of length 2, pairwise conditional dependencies among marriage ties from $i$ to $j$, $j$ to $k$ and $i$ to $k$ (and hence adds a parameter corresponding to the relation $MM \cap M$). Further, all possible stars comprising two relations are added as well, in order to investigate possible interdependencies between marriage and business ties that are not evident at the level of ties from an actor $i$ to an actor $j$ (see the comparison between the complete independence model 1a and the multiplex model 2). These dependencies also require various star parameters $\theta_Z$ for $Z$ equal to $MM'$, $M'M$, $M'B$ and $BB'$.

Table 6. Summary of fit of models 1a–6d to the Florentine network

Model   No. of parameters   $G^2_{PL}$   Mean absolute residual
1a      2                   487.2        0.048
1b      1                   487.9        0.048
2       3                   486.8        0.048
3a      4                   323.2        0.032
3b      5                   323.0        0.032
4       11                  277.9        0.029
5a      18                  243.7        0.026
5b      17                  246.3        0.026
6a      21                  227.9        0.026
6b      23                  226.7        0.026
6c      23                  225.2        0.026
6d      23                  217.0        0.025

The fit of model 5a was a modest improvement over that of model 4 ($\Delta G^2_{PL} = 34.2$ with six additional parameters). The estimated parameter corresponding to the relation $MM \cap M$ is not large, so in model 5b the parameter is removed, with little effect on the fit of the model ($\Delta G^2_{PL} = 2.6$).

A final set of models fitted to the data investigated the possibility of structural differences in ties according to party affiliation. As Padgett & Ansell (1993) observed, the first 10 family groups are substantially identified with the Medici party (the Medici family themselves comprising group 1), whereas the remaining groups of families are not. Padgett & Ansell described the remarkable structural differences between the network of relations within the Medici party and within the remaining (largely oligarchic) set. Models 6a–6d therefore allow various model 5b parameters to differ according to whether they refer to ties lying within the collection of Medici blocks, to ties connecting non-Medici blocks, or to ties crossing the boundary between the two collections of blocks. Model 6a allows such variation for the density parameter and is a substantial improvement over model 5b ($\Delta G^2_{PL} = 18.4$ with four additional parameters). Model 6b permits the parameters for 'mixed' out-stars comprising marriage and business ties to differ for the three types of blocks, and is not a substantial improvement over model 6a ($\Delta G^2_{PL} = 1.4$). Model 6c allows heterogeneity across blocks in the parameters for 2-paths comprising marriage and business ties; it also fails to improve fit compared to model 6a ($\Delta G^2_{PL} = 2.5$). The final model, 6d, permits heterogeneity across blocks in the parameters for paths comprising two marriage ties; in this case, there is a modest improvement in fit compared to model 6a ($\Delta G^2_{PL} = 10.8$ with two additional parameters).

The estimated parameters for model 6d are shown in Table 7. The estimates suggest a strong tendency for reciprocated business ties, a tendency that is unsurprising given the form of business or economic ties such as partnerships. There are weaker tendencies for the existence of 2-paths comprising either marriage or business ties; marriage ties also appear to be more likely if they complete a cycle of three marriage ties. Padgett & Ansell (1993) noted the presence of these cycles and analysed both their development and their consequences; they make a compelling argument for their importance to the evolving structure of the oligarchy. It can also be seen from Table 7 that path structures in which an outgoing marriage tie is accompanied by an incoming business tie reduce the likelihood of the overall structure. Estimates of star parameters suggest the prevalence of heterogeneous stars in which a group of families have marriage ties with one group and business ties with another. The parameter estimates for homogeneous marriage in-stars and out-stars are both negative; there appears to have been a reduced conditional probability of a marriage tie to a family group if some other group also had such a tie and, to a lesser extent, if the first family group had another outgoing marriage tie.
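One way to let a parameter differ across the three types of blocks used in models 6a–6d is to multiply its explanatory variable by block contrasts. The sketch below classifies a tie between family groups and returns sum-to-zero contrasts, which yields two free parameters per effect; this coding is an assumption chosen to be consistent with the four additional parameters reported for model 6a over two relations, and is not a documented feature of the original analysis.

```python
MEDICI_GROUPS = set(range(1, 11))    # the first 10 family groups, per the text

def block_type(i, j):
    """Classify a tie from group i to group j by party membership of its endpoints."""
    i_medici, j_medici = i in MEDICI_GROUPS, j in MEDICI_GROUPS
    if i_medici and j_medici:
        return "within Medici"
    if not i_medici and not j_medici:
        return "within other"
    return "between"

# Assumed sum-to-zero coding: the third effect is minus the sum of the other two.
CONTRASTS = {
    "within Medici": [1, 0],
    "between": [0, 1],
    "within other": [-1, -1],
}

def block_choice_terms(i, j):
    """Extra explanatory variables for a block-dependent choice (density) effect."""
    return CONTRASTS[block_type(i, j)]
```

The same contrasts can multiply any other change statistic, such as the one for marriage 2-paths, to give the corresponding block-dependent effect.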

The parameters for block-dependent densities suggest an enhanced likelihood ofmarriage ties within the Medici collection of family groups and to a lesser extent within

Logit models and logistic regressions for social networks II 187

the non-Medici collection marriage ties between the two types of family groups were lesslikely Business ties exhibit a substantially weaker pattern of the same form Together thesecharacteristics of the network re ect what Padgett amp Ansell noted was a remarkableinterdependence of marriage and economic ties on the one hand and political partisanshipon the other and they support their conclusion that the microstructure of marriage andeconomics was central to the formation of parties in Florence (1993 p 1277) The block-dependence of marriage 2-paths takes a different and interesting form such paths are lesslikely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection and they are even more likely to link family groups of different types Thegroup containing members of the Medici family is the major contributor to this pattern asthey are the only Medici group with marriage connections outside the collection mobilizedinto the Medici party Note that this structural effect is tted at the same time as the cyclicpattern for marriage ties so that although as Padgett amp Ansell noted there are many moretwo-step marriage connections for non-Medici than for Medici partisans many of the former


Table 7. Parameter estimates for model 6d fitted to the Florentine network

Model parameter                          Z               $\hat{\theta}_Z$   Approximate standard error

1-paths (choice)                         $M$             -5.17              1.02
                                         $B$             -7.37              1.25

2-cycles (reciprocity and exchange)      $M \cap M'$      0.95              0.94
                                         $B \cap B'$     10.33              1.72
                                         $M \cap B'$      0.65              1.08

2-paths                                  $MM$             0.66              0.32
                                         $MB$             0.16              0.38
                                         $BM$            -0.84              0.37
                                         $BB$             1.26              0.95

3-cycles                                 $MM \cap M'$     2.12              0.61
                                         $MB \cap M'$    -0.35              0.85

2-stars                                  $MM'$           -1.55              0.37
                                         $M'M$           -0.43              0.20
                                         $BB'$           -1.53              1.08
                                         $B'B$           -0.85              0.99
                                         $MB'$           -0.14              0.36
                                         $M'B$            0.92              0.35

Subgroup-dependent M effects
  1-paths within Medici                                   3.71              1.12
  1-paths between subgroups                              -4.67              1.92
  1-paths within other subgroups                          0.96

Subgroup-dependent B effects
  1-paths within Medici                                   0.70              1.06
  1-paths between subgroups                              -0.80              0.87
  1-paths within other subgroups                          0.10

Subgroup-dependent MM effects
  2-paths within Medici                                  -1.33              0.46
  2-paths between subgroups                               1.08              0.44
  2-paths within other subgroups                          0.25

connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).

Thus, model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell's detailed account.

5. Conclusion

The multivariate p* model is very general in form and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996) and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number r of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research – much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus, a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus, the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form $z(x)$ of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different 'parts' of the network (and in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that in the literature on algebraic models for multivariate networks there is a second tradition that can usefully guide such statistical developments. Local structural descriptions based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995) describe heterogeneity across individuals. Thus, a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown in general how Markov chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).
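To make the simulation step concrete, the following is a minimal sketch of a single-tie Metropolis sampler for a small two-relation p* model. It is an illustration under stated assumptions rather than the procedure of Preisler (1993): the function names `delta` and `simulate`, the three statistics used (a common choice effect, within-relation reciprocity and multiplexity) and the parameter values are all hypothetical.

```python
import numpy as np

def delta(x, m, i, j):
    """Change statistics for tie (i, j) on relation m (two relations assumed).

    Order: common choice, reciprocity within relation m, multiplexity.
    """
    return np.array([1.0, x[m, j, i], x[1 - m, i, j]], dtype=float)

def simulate(theta, g, n_steps, rng):
    """Toggle one randomly chosen tie per step; accept with the Metropolis rule."""
    x = np.zeros((2, g, g), dtype=int)           # start from the empty network
    for _ in range(n_steps):
        m = rng.integers(2)                      # relation index, 0 or 1
        i, j = rng.choice(g, size=2, replace=False)   # ordered pair, i != j
        log_ratio = theta @ delta(x, m, i, j)    # log-odds of tie present vs absent
        if x[m, i, j] == 1:                      # proposal removes the tie
            log_ratio = -log_ratio
        if np.log(rng.random()) < log_ratio:
            x[m, i, j] = 1 - x[m, i, j]
    return x

rng = np.random.default_rng(1)
theta = np.array([-1.5, 1.0, -0.5])              # hypothetical parameter values
x_sim = simulate(theta, g=10, n_steps=20000, rng=rng)
print(x_sim.sum(axis=(1, 2)))                    # number of ties on each relation
```

In a parametric bootstrap, each simulated network would be refitted by maximum pseudo-likelihood and the spread of the refitted estimates used as a guide to their standard errors; the chain length needed for adequate mixing is not addressed by this sketch.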

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754), and the National Institute of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance and Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.

Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.

Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.

Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Association, 47, 77–92.

Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.

Boorman, S. A., & White, H. C. (1976). Social structure from multiple networks: II. Role structures. American Journal of Sociology, 81, 1384–1446.

Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.

Breiger, R. L., Boorman, S. A., & Arabie, P. (1975). An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.

Coleman, J. S., Katz, E., & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N., & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R., & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B., & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency, and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure II: Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E., & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M., & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.

Fienberg, S. E., Meyer, M. M., & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds), Operations research proceedings 1986, pp. 455–460. Berlin/Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathématiques, Informatique et Sciences Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B., & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O., & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds), Quo Vadis, Graph Theory? Annals of Discrete Mathematics, 55, 349–366.

Frank, O., & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J., & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J., & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W., & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W., & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J., & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D., & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.

Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L., & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E., & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A., & Pattison, P. (1998). Multivariate p* models of producers' market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P., & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F., & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P., & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G., & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). $p_{1/2}$. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D., & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M., & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.

Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S., & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S., & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 60, 401–425.

Wasserman, S., & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S., & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A., & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C., & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999


which has the advantage of not depending on the normalizing quantity. We next consider the odds ratio, which simplifies model (3):

$$\frac{P(X_{ijm} = 1 \mid X^c_{ijm})}{P(X_{ijm} = 0 \mid X^c_{ijm})} = \frac{\exp\{\theta' z(x^+_{ijm})\}}{\exp\{\theta' z(x^-_{ijm})\}} = \exp\{\theta'[z(x^+_{ijm}) - z(x^-_{ijm})]\}. \qquad (4)$$

From this, the log-odds ratio, or logit model, has the rather simple expression

$$\omega_{ijm} = \log\left\{\frac{P(X_{ijm} = 1 \mid X^c_{ijm})}{P(X_{ijm} = 0 \mid X^c_{ijm})}\right\} = \theta'[z(x^+_{ijm}) - z(x^-_{ijm})]. \qquad (5)$$

If we define $\delta(x_{ijm}) = [z(x^+_{ijm}) - z(x^-_{ijm})]$, then the logit model (5) simplifies succinctly to $\omega_{ijm} = \theta'\delta(x_{ijm})$. The expression $\delta(x_{ijm})$ is the vector of changes in the network statistics that arise when the quantity $x_{ijm}$ changes from 1 to 0. This version of the model is a logit p* model for a multivariate network and is a generalized autologistic model (see Strauss, 1992) applied to social network data.
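The definition of $\delta(x_{ijm})$ can be illustrated with a short sketch that forces a single tie to 1 and to 0 and differences the resulting statistics. The particular statistics chosen here (two choice counts, reciprocity on the first relation and one exchange count), the function names and the parameter values are assumptions made for illustration only; relations are indexed from 0 in the code.

```python
import numpy as np

def z_stats(x):
    """Statistics for a stack x of shape (2, g, g) of binary adjacency matrices.

    Returns [f_1, f_2, f_recip, f_exch]: tie counts on each relation, the number
    of mutual dyads on the first relation, and the number of ordered pairs
    (i, j) with a first-relation tie i -> j and a second-relation tie j -> i.
    """
    a, b = x[0], x[1]
    return np.array([a.sum(), b.sum(), (a * a.T).sum() / 2.0, (a * b.T).sum()],
                    dtype=float)

def change_stats(x, i, j, m):
    """delta(x_ijm) = z(x+_ijm) - z(x-_ijm)."""
    x_plus, x_minus = x.copy(), x.copy()
    x_plus[m, i, j] = 1     # tie forced present
    x_minus[m, i, j] = 0    # tie forced absent
    return z_stats(x_plus) - z_stats(x_minus)

def conditional_prob(theta, x, i, j, m):
    """P(X_ijm = 1 | rest), obtained from the logit in equation (5)."""
    omega = theta @ change_stats(x, i, j, m)
    return 1.0 / (1.0 + np.exp(-omega))

# Three actors; actor 1 sends a tie to actor 0 on both relations.
x = np.zeros((2, 3, 3), dtype=int)
x[0, 1, 0] = 1
x[1, 1, 0] = 1
theta = np.array([-2.0, -2.0, 1.5, 0.5])      # hypothetical parameter values
print(change_stats(x, 0, 1, 0))                # [1. 0. 1. 1.]
print(conditional_prob(theta, x, 0, 1, 0))     # 0.5
```

Differencing the full statistic vector is wasteful for large networks; in practice each change statistic would be written in closed form, but the brute-force version makes the definition explicit.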

3.3. Estimation

The likelihood function for the general form of the multivariate p* model (2) is

$$L(\theta) = \frac{\exp\{\theta' z(x)\}}{\kappa(\theta)},$$

where the dependence on the normalizing quantity $\kappa(\theta)$ can easily be seen. As mentioned, maximum likelihood estimation of $\theta$ is difficult due to the size of the sample space.

An approximate estimation approach, proposed by Besag (1975, 1977b) and adopted by Strauss (1986), Strauss & Ikeda (1990) and Wasserman & Pattison (1996), utilizes tools made popular in models for rectangular lattices and spatial data. Specifically, we use the logit formulation and define the pseudo-likelihood function as

$$PL(\theta) = \prod_{i \neq j} \prod_{m=1}^{r} P(X_{ijm} = 1 \mid X^c_{ijm})^{x_{ijm}} \, P(X_{ijm} = 0 \mid X^c_{ijm})^{1 - x_{ijm}} \qquad (6)$$

and a maximum pseudo-likelihood estimator (MPLE) to be the value of $\theta$ that maximizes (6). MPLEs are much easier to calculate than maximum likelihood estimators (MLEs). MPLEs differ from MLEs for all but the simplest models (those for which the conditional probabilities are indeed independent of the complement relation). Basically, the approach assumes conditional independence of the random variables representing the multivariate relational ties (for discussion of the issues in using maximum pseudo-likelihood rather than maximum likelihood estimation, see Wasserman & Pattison, 1996, and Preisler, 1993).

There is a large literature on the use of approximate likelihoods in spatial modelling. Diggle (1996) reviews models for discrete spatial variation and notes that there are several possible estimation techniques. He notes in his detailed discussion that MPLEs are more efficient than other possibilities (which include the coding method of Besag, 1974). Further, for moderately large samples, the differences between MPLEs and MLEs are often negligible. Small sample sizes, and hence small networks ($g < 10$), unfortunately, are particularly problematic.


In social network modelling, Strauss & Ikeda (1990) established that estimation of $\theta$ for single dichotomous relations can be accomplished via logistic regression, using any standard logistic regression model-fitting routine. In particular, they showed that maximizing the pseudo-likelihood given in equation (6) is equivalent to maximizing the likelihood function for the fit of logistic regression to model (5) (for independent observations $x_{ijm}$). Further, they observed that such logistic regressions can be fitted using iteratively reweighted Gauss–Newton computational techniques, as implemented by any logistic regression model package.

The proof of this result uses the fact that the derivatives of the pseudo-likelihood, set equal to zero, are identical to those obtained from a logistic regression with the relational variables as data values. Thus, fitting p* can be done by using the logit p* form and assuming that the relational variables are actually statistically independent. The idea for this theorem was first suggested by Frank & Strauss (1986) for estimation of the parameters in their triad model. The generalization of this result to the three-way binary array X is straightforward.
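As a hedged illustration of this equivalence, the sketch below builds one row of change statistics per ordered pair and relation and then solves the logistic-regression score equations by Newton–Raphson. The three statistics (common choice, within-relation reciprocity, multiplexity), the function names `design` and `fit_logistic`, and the randomly generated two-relation network are assumptions introduced for the example; any standard logistic regression routine could replace the hand-written fit.

```python
import numpy as np

def design(x):
    """One row of change statistics per tie variable x_ijm (two relations assumed)."""
    r, g, _ = x.shape
    rows, y = [], []
    for m in range(r):
        for i in range(g):
            for j in range(g):
                if i != j:
                    # choice, reciprocity on relation m, multiplexity with the other relation
                    rows.append([1.0, x[m, j, i], x[1 - m, i, j]])
                    y.append(x[m, i, j])
    return np.array(rows, dtype=float), np.array(y, dtype=float)

def fit_logistic(D, y, n_iter=50, tol=1e-10):
    """Newton-Raphson for logit(p) = D @ theta; no separate intercept is added,
    since the choice column of ones already plays that role."""
    theta = np.zeros(D.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-(D @ theta)))
        grad = D.T @ (y - p)                          # pseudo-likelihood score
        hess = (D * (p * (1.0 - p))[:, None]).T @ D   # observed information
        step = np.linalg.solve(hess, grad)
        theta += step
        if np.max(np.abs(step)) < tol:
            break
    return theta

rng = np.random.default_rng(0)
x = (rng.random((2, 12, 12)) < 0.3).astype(int)       # toy data, not the paper's
for m in range(2):
    np.fill_diagonal(x[m], 0)
D, y = design(x)
theta_hat = fit_logistic(D, y)
print(theta_hat)
```

The standard errors reported by a logistic regression package for such a fit treat the tie variables as independent, so they offer approximate guidance at best.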

The evaluation of the fit of multivariate p* is not straightforward, but it is helpful to compare the observed values $x_{ijm}$ with the fitted values $\hat{x}_{ijm}$. The fitted values, as is common with dichotomous variables, are defined as $\hat{x}_{ijm} = \hat{P}(X_{ijm} = 1 \mid X^c_{ijm})$. The estimated conditional probabilities are computed from

$$\operatorname{logit} \hat{P}(X_{ijm} = 1 \mid X^c_{ijm}) = \hat{\theta}'\delta(x_{ijm}).$$

Two useful indices of fit are the pseudo-likelihood ratio statistic

$$G^2_{PL} = 2\sum x_{ijm}\log(x_{ijm}/\hat{x}_{ijm})$$

for a model, and the mean of the absolute value of the residuals $(x_{ijm} - \hat{x}_{ijm})$. In the examples below, we report both $G^2_{PL}$ and the mean absolute residual. Unfortunately, as with all other uses of this MPLE approach, the distribution of $G^2_{PL}$ is unknown, even asymptotically, and there is no straightforward way of estimating the standard errors of parameter estimates (although asymptotic standard errors calculated from logistic regression models can give approximate guidance to the modeller). Crouch & Wasserman (1998) give some preliminary results comparing MPLEs to MLEs and report the optimistic finding that for moderately large networks ($g > 10$), both standard errors and test statistics based on the pseudo-likelihood approach are quite close to those based on the exact likelihood.
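The two indices can be computed directly from the observed ties and the fitted conditional probabilities. The sketch below rests on one assumption that should be stated plainly: the sum in $G^2_{PL}$ is read as running over both outcomes of each tie variable, so that the statistic equals $-2\log PL$ evaluated at the fitted probabilities. The function name `fit_indices` and the toy values are hypothetical.

```python
import numpy as np

def fit_indices(y, p_hat):
    """Return (G2_PL, mean absolute residual).

    y     : observed binary tie values x_ijm, flattened
    p_hat : fitted conditional probabilities for the same ties
    G2_PL is computed as -2 log PL, i.e. the binomial deviance (an assumed
    reading of the displayed formula).
    """
    y = np.asarray(y, dtype=float)
    p_hat = np.asarray(p_hat, dtype=float)
    log_pl = np.sum(y * np.log(p_hat) + (1.0 - y) * np.log(1.0 - p_hat))
    return -2.0 * log_pl, float(np.mean(np.abs(y - p_hat)))

g2, mar = fit_indices([1, 0, 1, 1], [0.5, 0.5, 0.5, 0.5])
print(g2, mar)   # 8*log(2) (about 5.545) and 0.5
```

Fitted probabilities of exactly 0 or 1 would make the logarithms undefined; a practical implementation would need to guard against that case.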

3.4. Computational details

Maximum pseudo-likelihood estimates of the parameters of model (1) are obtained by fitting the logistic regression model (5). In order to fit model (5), we compute, for each relational tie, the values of the 'explanatory variables' $z(x^+_{ijm}) - z(x^-_{ijm})$ corresponding to each statistic $z(x)$; we then use these as the observed explanatory variables for the realization of $X_{ijm}$ (the 'response variable') in the logistic regression corresponding to model (5).

The computation of the values $z(x^+_{ijm}) - z(x^-_{ijm})$ is simple, but it is useful to note that the values may take a different form for the various types of relational ties (corresponding to the subscript $m$ of $X_{ijm}$). For example, suppose that there are two relations, $X_l$ and $X_h$, and consider the parameter corresponding to the triadic effect $Z = (X_l X_h) \cap X_h$. If we assume homogeneity, then the sufficient statistic for this parameter is $f_Z$. For the two relations, the computed values of the explanatory variable for this triadic effect are equal to the changes in the statistic $f_Z$ when $x_{ijm}$ changes from 1 to 0, for $m = l$ or $h$. Thus, when $m = l$ (corresponding to the values for the first relation $X_l$), we compute $\sum_k x_{ikh}x_{jkh}$ as the value of the explanatory variable corresponding to this parameter, and when $m = h$ (corresponding to an $X_h$ tie), we compute $\sum_k x_{ikl}x_{kjh} + \sum_k x_{kil}x_{kjh}$.
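These two expressions can be checked numerically. The sketch below assumes that $f_Z = \sum_{i,j,k} x_{ikl}x_{kjh}x_{ijh}$ for $Z = (X_l X_h) \cap X_h$ and that both matrices have zero diagonals; the function names and the random test matrices are introduced only for this illustration.

```python
import numpy as np

def f_Z(xl, xh):
    """f_Z = sum over i, j, k of x_ikl * x_kjh * x_ijh."""
    return float(((xl @ xh) * xh).sum())

def change_l(xl, xh):
    """Entry (i, j): change in f_Z for an X_l tie i -> j, sum_k x_ikh * x_jkh."""
    return xh @ xh.T

def change_h(xl, xh):
    """Entry (i, j): change in f_Z for an X_h tie i -> j,
    sum_k x_ikl * x_kjh + sum_k x_kil * x_kjh."""
    return xl @ xh + xl.T @ xh

def brute_change(xl, xh, i, j, rel):
    """f_Z with the tie forced to 1 minus f_Z with the tie forced to 0."""
    plus = [xl.copy(), xh.copy()]
    minus = [xl.copy(), xh.copy()]
    k = 0 if rel == "l" else 1
    plus[k][i, j] = 1
    minus[k][i, j] = 0
    return f_Z(*plus) - f_Z(*minus)

rng = np.random.default_rng(3)
g = 7
xl = (rng.random((g, g)) < 0.4).astype(int)
xh = (rng.random((g, g)) < 0.4).astype(int)
np.fill_diagonal(xl, 0)
np.fill_diagonal(xh, 0)

agree = all(
    brute_change(xl, xh, i, j, "l") == change_l(xl, xh)[i, j]
    and brute_change(xl, xh, i, j, "h") == change_h(xl, xh)[i, j]
    for i in range(g) for j in range(g) if i != j
)
print(agree)
```

Because the diagonals are zero, neither closed-form expression involves the tie being toggled, which is why the matrix products reproduce the brute-force differences exactly.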

4. Examples

We illustrate the construction and fitting of multivariate p* models using two examples.

4.1. The Grade 7 peer network

The first example is an extension of the data analysed by Wasserman & Pattison (1996). Vickers (1981) and Vickers & Chan (1981) obtained network data from 29 students in grade 7 in a school in Victoria, Australia. They asked students to nominate their classmates on a number of relations, including the following:

1) Who are your best friends in the class?
2) Who would you rather not have as a friend?

We label the relations defined by these two questions as $X_B$ (relation 1) and $X_N$ (relation 2), and their associated matrices as B and N, respectively. The matrix for the 'best friends' relation is given here as our Table 2, and the matrix for the 'not friends' relation as our Table 3. As noted by Wasserman & Pattison (1996), actors 1–12 are boys, while actors 13–29 are girls.

In Wasserman & Pattison (1996), we analysed the relation $X_B$ and established that it possessed strong reciprocity and transitivity effects. Here, we fit models simultaneously to the relations $X_B$ and $X_N$ in an attempt to model their mutual interdependence. Our models use the methodology described earlier and are guided by the literature that has speculated on the structure of positive and negative affect ties (see the discussion in Wasserman & Faust, 1994, Chapter 6, on signed graphs); we also compare our models to previous descriptive analyses of similar types of ties. We report the fit of a number of homogeneous models.

Models 1a and 1b – independence. We first fit two versions of a complete independence model in which we make the (implausible) assumption that all observed ties are independent. In the first version of the model, we allow a single separate 'choice' parameter $\theta_Z$ (where $Z$ may be either $B$ or $N$) for each type of relation; in the second, more restricted version, we assume a single common choice parameter. In both versions of the model, the maximal cliques of the dependence graph have the form $\{(i, j, m)\}$; in model 1a, the parameters corresponding to this clique are assumed to depend on relation $m$ (but not on actor $i$ or $j$), whereas in model 1b the parameter is assumed constant for all $i$, $j$ and $m$. The sufficient statistics for model 1a are $f_B$ and $f_N$; model 1b has sufficient statistic $f_{B+N}$. The fit of models 1a and 1b is summarized in Table 4. Neither model provides a good fit, with the mean of the absolute residuals equal to approximately 0.37. Since model 1b is nested in model 1a, the difference between the pseudo-likelihood ratio statistics is of interest, and we note that model 1b appears to be no worse a fit than model 1a ($\Delta G^2_{PL} = 3.2$, and the models differ by one parameter).


Model 2 – multiplexity. Model 2 is a multiplexity model with maximal cliques $\{(i, j, 1), (i, j, 2)\}$. The model allows for the possibility that an $X_B$ tie from $i$ to $j$ is conditionally dependent on an $X_N$ tie from $i$ to $j$. The parameters of the model have the form $\theta_Z$, where $Z$ may be $B$, $N$ or $B \cap N$; the corresponding sufficient statistics are $f_B$, $f_N$ and $f_{B \cap N}$, respectively. Thus, this model adds a single multiplex parameter $\theta_{B \cap N}$ to the two choice parameters in model 1a. Model 2 appears to be a substantial improvement over model 1a ($\Delta G^2_{PL} = 253.7$ with one additional parameter), but the small frequency of $B \cap N$ ties implies that the MPLE of its corresponding parameter is likely to have a large standard error.

Models 3a and 3b – reciprocity and exchange. Model 3 assumes bivariate dyad independence (as described by Wasserman, 1987) and has maximal cliques $\{(i, j, 1), (i, j, 2), (j, i, 1), (j, i, 2)\}$. We fit two restricted versions of the model: first, model 3a, in which only choice and reciprocity effects are assumed (with parameters $\theta_Z$ for $Z = B$, $N$, $B \cap B'$ and $N \cap N'$); and second, model 3b, with an additional exchange parameter $\theta_Z$ for the relation $Z = B \cap N'$. In model 3a, the presence of an $X_B$ tie from $i$ to $j$ is assumed to be conditionally dependent on the presence of an $X_B$ tie from $j$ to $i$ (that is, on the presence of reciprocity); similarly for $X_N$ ties. Model 3b allows, in addition, the presence of an $X_B$ tie from $i$ to $j$ to be conditionally dependent on the presence of an $X_N$ tie from $j$ to $i$ (that


Table 2. Vickers & Chan's (1981) network data: 'best friends' relation

0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 1 1 0 1 0 1 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 00 0 0 1 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 1 0 0 0 1 0 1 0 1 1 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 0 1 01 1 1 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 01 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 0 0 1 1 11 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 00 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 01 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 0 0 1 1 1 1 0 0 0 0 0 0 01 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 1 1 1 1 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 1 0 0 0 1 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 1 1 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 1 0 1 0 0 1 1 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 1 0 0 1 1 1 10 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 00 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 10 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0

is, on the exchange of an $X_N$ tie for an $X_B$ one). We have not fitted the most general homogeneous dyad-independence model, which includes multiplexity parameters, since $B$ and $N$ co-occur only rarely (and, as a result, it is difficult to fit parameters corresponding to relations such as $B \cap N$, $B \cap N \cap B'$, and so forth). The fit statistics in Table 4 indicate that not only is model 3a a substantial improvement over model 1a ($\Delta G^2_{PL} = 208.6$ with just two additional parameters), but also that model 3b provides a marginally better fit than model 3a


Table 3. Vickers & Chan's (1981) network data: 'not friends' relation

0 0 0 0 0 0 1 0 1 1 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 0 0 1 10 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 10 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 1 0 0 00 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 01 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 10 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 00 0 0 0 1 1 0 0 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 01 0 1 1 0 0 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 10 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 00 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 10 0 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 00 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 00 0 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 10 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 00 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 00 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 01 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 10 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 11 0 1 1 0 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 00 1 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 00 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 01 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 11 0 0 1 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 01 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 01 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0

Table 4. Summary of fit of models 1a–5b to the grade 7 peer network

Model   No. of parameters   G²_PL    Mean absolute residual
1a      2                   1794.1   0.366
1b      1                   1797.3   0.367
2       3                   1540.4   0.329
3a      4                   1585.5   0.315
3b      5                   1558.4   0.311
4       13                  1511.0   0.300
5a      19                  1220.6   0.241
5b      23                  1032.3   0.196

(ΔG²_PL = 27.1 with one additional parameter). These figures suggest the presence of both reciprocity and exchange effects. Note, though, that the fit of model 3b is still not particularly good, with the mean of the absolute residuals equal to 0.311.

Model 4 – path dependence. Model 4 is a path-dependent model and assumes that a tie of any type from i to j may be conditionally dependent on ties of any type from j to some third individual k. Maximal cliques therefore have the form {(i, j, m), (j, i, h)} or {(i, j, m), (j, k, h), (k, i, p)}; parameters and sufficient statistics are given by θ_Z and φ_Z, respectively, where Z may be any of the relations B, N, B ∩ B′, N ∩ N′, B ∩ N′, BB, BN, NB, NN, BB ∩ B′, BN ∩ B′, BN ∩ N′ and NN ∩ N′. Compared to model 3b, model 4 adds only marginally to the fit (ΔG²_PL = 47.4 with eight additional parameters).
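The compound relations that index the model 4 parameters can be built mechanically from the two adjacency matrices. The following is a minimal sketch, assuming that juxtaposition (for example BN) denotes Boolean relational composition, that ∩ denotes elementwise intersection, and that the prime denotes the converse (transposed) relation. The two 3-node matrices are toy data, not the Vickers & Chan network, and the helper names are hypothetical.

```python
def compose(a, b):
    """Boolean composition: (i, j) is in AB if a[i][k] and b[k][j] for some k."""
    n = len(a)
    return [[int(any(a[i][k] and b[k][j] for k in range(n))) for j in range(n)]
            for i in range(n)]


def converse(a):
    """Converse relation A': (i, j) is in A' if (j, i) is in A."""
    n = len(a)
    return [[a[j][i] for j in range(n)] for i in range(n)]


def intersect(a, b):
    """Elementwise intersection of two 0/1 relations."""
    n = len(a)
    return [[a[i][j] & b[i][j] for j in range(n)] for i in range(n)]


def count_ties(a):
    """Number of ordered pairs (i, j), i != j, in the relation."""
    n = len(a)
    return sum(a[i][j] for i in range(n) for j in range(n) if i != j)


# Toy data (assumed for illustration only).
B = [[0, 1, 0],
     [1, 0, 1],
     [0, 0, 0]]
N = [[0, 0, 0],
     [0, 0, 0],
     [1, 0, 0]]

relations = {
    "B & B'": intersect(B, converse(B)),                 # reciprocated B ties
    "BN": compose(B, N),                                 # B tie followed by N tie
    "BN & B'": intersect(compose(B, N), converse(B)),    # BN path closed by a B tie
    "BN & N'": intersect(compose(B, N), converse(N)),    # BN path closed by an N tie
}
for name, rel in relations.items():
    print(name, count_ties(rel))
```

The pair counts printed at the end are only a convenient summary of each relation; they are not claimed to be the exact form of the sufficient statistics used in the paper.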

Models 5a and 5b – restricted Markov random graph models. The final set of models are path-dependent models with additional dependencies assumed on substantive grounds. All models have the model 4 parameters; in addition, model 5a possesses dependencies consistent with the transitivity-like hypothesis that friends are likely to agree on their relations with third parties (hence likely pairwise conditional dependencies between relational ties of type X_B from i to j, i to k and j to k, and also between relational ties of type X_B from i to j, of type X_N from i to k, and of type X_N from j to k). Model 5b possesses additional dependencies consistent with the claim that non-friends are likely to disagree on their relations with third parties (hence likely pairwise conditional dependencies between relational ties of type X_N from i to j, of type X_N from i to k, and of type X_B from j to k, and also between relational ties of type X_N from i to j, of type X_B from i to k, and of type X_N from j to k). (See Johnsen (1986) for a review and analysis of the literature on the structure of affective ties, and Pattison (1993) for an algebraic translation of these structural claims.) Model 5a adds {(i, j, 1), (j, k, 1), (i, k, 1)} and {(i, j, 1), (j, k, 2), (i, k, 2)} to the set of maximal cliques for model 4; model 5b also adds {(i, j, 2), (j, k, 1), (i, k, 2)} and {(i, j, 2), (j, k, 2), (i, k, 1)}. We note that all of the subcliques of these additional maximal cliques have corresponding parameters in models 5a and 5b; these additional subcliques correspond to various forms of stars: {(i, j, m), (i, k, h)}, {(i, j, m), (k, j, h)} and {(i, j, m), (j, k, h)}. As indicated in Table 4, the additional dependencies assumed by model 5a lead to a substantial improvement over the simple path-dependent model 4 (ΔG²_PL = 290.4 with six additional parameters), and those associated with model 5b lead to a modest further improvement in fit (ΔG²_PL = 188.3 with four additional parameters). The mean of the absolute residuals for model 5b is 0.196, suggesting a more reasonable fit to the data (but one that could lend itself to further possible improvement).
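The ΔG²_PL values quoted for the grade 7 peer network follow directly from Table 4. The short sketch below recomputes them, reading the Table 4 entries with one decimal place (1794.1, 1797.3, and so on); the dictionary layout and the function name are illustrative assumptions.

```python
# (number of parameters, G2_PL) for each model, as read from Table 4.
TABLE4 = {
    "1a": (2, 1794.1), "1b": (1, 1797.3), "2": (3, 1540.4),
    "3a": (4, 1585.5), "3b": (5, 1558.4), "4": (13, 1511.0),
    "5a": (19, 1220.6), "5b": (23, 1032.3),
}


def improvement(smaller, larger, table=TABLE4):
    """Return (extra parameters, drop in G2_PL) when moving to the larger model."""
    p_small, g_small = table[smaller]
    p_large, g_large = table[larger]
    return p_large - p_small, round(g_small - g_large, 1)


for pair in [("1a", "3a"), ("3a", "3b"), ("3b", "4"), ("4", "5a"), ("5a", "5b")]:
    extra, delta = improvement(*pair)
    print(f"{pair[0]} -> {pair[1]}: delta G2_PL = {delta} with {extra} extra parameter(s)")
```

These differences are descriptive comparisons only; the sketch does not attach a reference distribution to them.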

The MPLEs for the parameters of model 5b are displayed in Table 5. Positive estimates were observed for both reciprocity parameters and for the parameters associated with three of the four additional hypothesized dependencies. Thus, the conditional odds of a tie of any type appear to be enhanced if a reciprocal tie of the same type is present, if the tie completes one of the expected triadic structures for agreement between friends, or if the tie completes a triad in which an individual would rather not have as a friend any friend of someone who has been indicated as a non-friend. Negative estimates were obtained for the exchange parameter, for 2-stars comprising two incoming X_B ties, and for 3-cycles comprising X_B ties. Thus, the conditional odds of a tie of any type appear to be reduced by the presence of a reciprocated tie of the other type; in addition, the odds of an X_B tie being directed to a particular individual are reduced if other X_B ties are also directed to the same individual, or if the tie completes a 3-cycle of X_B ties.
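Because the model is linear on the logit scale, each estimate can be read as a multiplicative change in the conditional odds of a tie when the corresponding statistic increases by one, with everything else held fixed. The sketch below exponentiates a few estimates as read from Table 5 (decimal placement assumed to be 2.53, 0.61, -0.67 and -0.72); the dictionary and function name are illustrative.

```python
import math

# Selected model 5b estimates, as read from Table 5 (assumed decimal placement).
estimates = {
    "B & B' (reciprocity)": 2.53,
    "N & N' (reciprocity)": 0.61,
    "B & N' (exchange)": -0.67,
    "BB & B' (3-cycle)": -0.72,
}


def odds_multiplier(theta):
    """Factor by which the conditional odds change per unit increase in the statistic."""
    return math.exp(theta)


for name, theta in estimates.items():
    print(f"{name}: odds multiplied by {odds_multiplier(theta):.2f}")
```

A multiplier above 1 corresponds to enhanced conditional odds and a multiplier below 1 to reduced odds, matching the verbal interpretation above. The standard errors in Table 5 are approximate, so these factors should be read as rough guides.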

4.2. Padgett & Ansell's Florentine network

Table 5. Parameter estimates for model 5b fitted to the grade 7 peer network

Model parameter                        Z          θ_Z      Approximate standard error
1-paths (choice)                       B          -1.81    0.76
                                       N          -2.39    0.65
2-cycles (reciprocity & exchange)      B ∩ B′      2.53    0.37
                                       N ∩ N′      0.61    0.26
                                       B ∩ N′     -0.67    0.28
2-paths                                BB          0.01    0.05
                                       BN         -0.03    0.04
                                       NB         -0.11    0.04
                                       NN          0.02    0.04
3-cycles                               BB ∩ B′    -0.72    0.14
                                       BN ∩ B′     0.05    0.08
                                       BN ∩ N′     0.03    0.07
                                       NN ∩ N′    -0.05    0.09
2-stars                                BB′        -0.36    0.08
                                       BN′        -0.08    0.04
                                       NN′         0.06    0.04
                                       B′B        -0.01    0.04
                                       B′N        -0.04    0.03
                                       N′N         0.07    0.02
Additional hypothesized constraints    BB ∩ B      0.57    0.06
                                       BN ∩ N      0.17    0.05
                                       NB ∩ N      0.33    0.05
                                       NN ∩ B     -0.09    0.06

Our second example is an analysis of marriage and business ties among groups of Florentine families (Padgett & Ansell, 1993). In an analysis of the rise to power of the Medici family in Florence in the early fifteenth century, Padgett & Ansell constructed a number of network relations among 33 groups of elite families, including marriage and business or economic ties. The construction was based on a coding of various types of network relations among a 92-family ruling elite from Kent's (1978) description of the network foundations of the Medici party and their opponents. Padgett & Ansell used marriage and economic networks to derive a clustering of the 92 families into 33 family groups (using the CONCOR algorithm; see Breiger, Boorman & Arabie, 1975); they then coded a relation of a particular type between two family groups if there were at least two pairs of families, with one family from each group, linked by a relation of that type. The analysis presented below is for marriage and economic relations among these 33 family groups, shown in Figure 2a of Padgett & Ansell (1993); for the purpose of the analysis reported below, within-group relationships are ignored and the various types of economic ties are aggregated into a single business/economic relation. Thus, a marriage tie is coded from one group to another if a woman of the first group is married to a man in the second; a business/economic tie signifies the presence of trading or partnership relationships, the sharing or renting of real estate, or a bank employment relation (see Padgett & Ansell, 1993, pp. 1265–1266).
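The group-level coding rule described above (a tie between two family groups requires at least two linked pairs of families, and within-group ties are ignored) can be sketched as follows. The family and group labels are invented for illustration, and the sketch assumes that linked pairs are counted as directed family-to-family ties, which the text does not spell out.

```python
def group_relation(family_ties, group_of, threshold=2):
    """Aggregate directed family-level ties into directed group-level ties.

    family_ties: iterable of (sender_family, receiver_family) pairs.
    group_of:    dict mapping each family to its group label.
    A group pair (g, h), g != h, is kept if at least `threshold` distinct
    family pairs run from a family in g to a family in h.
    """
    counts = {}
    for a, b in set(family_ties):          # set(): count each family pair once
        g, h = group_of[a], group_of[b]
        if g == h:                         # within-group relationships are ignored
            continue
        counts[(g, h)] = counts.get((g, h), 0) + 1
    return {pair for pair, c in counts.items() if c >= threshold}


# Hypothetical toy data: five families in three groups.
group_of = {"a1": "A", "a2": "A", "b1": "B", "b2": "B", "c1": "C"}
ties = [("a1", "b1"), ("a2", "b1"), ("b2", "c1"), ("a1", "a2")]

print(group_relation(ties, group_of))
```

In this toy example only the pair (A, B) reaches the threshold of two linked family pairs; the single tie from group B to group C and the within-group tie in A are dropped.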

Padgett & Ansell used the interconnections among social and demographic factors, these relational ties, and actions on the part of Cosimo de' Medici to explain the source of the latter's extraordinary power; here we examine the joint network structure of the marriage and business/economic ties.

We label the relations studied by Padgett & Ansell as X_B (business ties) and X_M (marriage ties). Their associated matrices are B and M, respectively.

In Table 6, we report the fit of six classes of models, similar in construction to those reported for the grade 7 peer network. As for the grade 7 peer network, models 1a and 1b are two- and one-parameter complete independence models, respectively, and model 2 is a multiplexity model. It is clear from Table 6 that there is little improvement in fit of the two-parameter choice complete independence model (model 1a) over the one-parameter choice model (model 1b) (ΔG²_PL = 0.7 with one extra parameter); in addition, permitting dependencies among marriage and business ties for the same individuals does little to improve model fit (ΔG²_PL = 0.4 for model 2 compared to model 1a). Models 3a and 3b are reciprocity and exchange models. Model 3a adds to model 1a the reciprocity effects for X_B and X_M ties; model 3b further adds the exchange effect that allows conditional dependence of a marriage tie from i to j and a business tie from j to i. The reciprocity effects in model 3a lead to a substantial improvement in fit over model 1a (ΔG²_PL = 164.0 with two additional parameters), but no further improvement is achieved by permitting the dyadic exchange of marriage and business ties (ΔG²_PL = 0.2). Model 4 is a path-dependent model and is a marginal improvement in fit over model 3b (ΔG²_PL = 45.1 with six additional parameters). Parameters corresponding to cycles with two or more business ties were excluded from the model because of the infrequency of occurrence of such structures.

Since, as Padgett & Ansell (1993) note, the gaining of hierarchical status was the primary consideration in the arrangement of marriage ties between elite families, we might expect marriage ties to exhibit a tendency towards transitivity. Hence, model 5a assumes, in addition

Table 6. Summary of fit of models 1a–6d to the Florentine network

Model   No. of parameters   G²_PL   Mean absolute residual
1a      2                   487.2   0.048
1b      1                   487.9   0.048
2       3                   486.8   0.048
3a      4                   323.2   0.032
3b      5                   323.0   0.032
4       11                  277.9   0.029
5a      18                  243.7   0.026
5b      17                  246.3   0.026
6a      21                  227.9   0.026
6b      23                  226.7   0.026
6c      23                  225.2   0.026
6d      23                  217.0   0.025

to conditional dependencies for paths of length 2, pairwise conditional dependencies among marriage ties from i to j, j to k, and i to k (and hence adds a parameter corresponding to the relation X = MM ∩ M). Further, all possible stars comprising two relations are added as well, in order to investigate possible interdependencies between marriage and business ties that are not evident at the level of ties from an actor i to an actor j (see the comparison between the complete independence model 1a and the multiplex model 2). These dependencies also require various star parameters θ_Z, for Z equal to MM′, M′M, M′B and BB′.

The fit of model 5a was a modest improvement over that of model 4 (ΔG²_PL = 34.2 with six additional parameters). The estimated parameter corresponding to the relation MM ∩ M is not large, so in model 5b the parameter is removed, with little effect on the fit of the model (ΔG²_PL = 2.6).

A final set of models fitted to the data investigated the possibility of structural differences in ties according to party affiliation. As Padgett & Ansell (1993) observed, the first 10 family groups are substantially identified with the Medici party (the Medici family themselves comprising group 1), whereas the remaining groups of families are not. Padgett & Ansell described the remarkable structural differences between the network of relations within the Medici party and within the remaining (largely oligarchic) set. Models 6a–6d therefore allow various model 5b parameters to differ according to whether they refer to ties lying within the collection of Medici blocks, to ties connecting non-Medici blocks, or to ties crossing the boundary between the two collections of blocks. Model 6a allows such variation for the density parameter and is a substantial improvement over model 5b (ΔG²_PL = 18.4 with four additional parameters). Model 6b permits the parameters for 'mixed' out-stars comprising marriage and business ties to differ for the three types of blocks, and is not a substantial improvement over model 6a (ΔG²_PL = 1.4). Model 6c allows heterogeneity across blocks in the parameters for 2-paths comprising marriage and business ties; it also fails to improve fit compared to model 6a (ΔG²_PL = 2.5). The final model, 6d, permits heterogeneity across blocks in the parameters for paths comprising two marriage ties; in this case, there is a modest improvement in fit compared to model 6a (ΔG²_PL = 10.8 with two additional parameters).

The estimated parameters for model 6d are shown in Table 7. The estimates suggest a strong tendency for reciprocated business ties, a tendency that is unsurprising given the form of business or economic ties such as partnerships. There are weaker tendencies for the existence of 2-paths comprising either marriage or business ties; marriage ties also appear to be more likely if they complete a cycle of three marriage ties. Padgett & Ansell (1993) noted the presence of these cycles and analysed both their development and their consequences; they make a compelling argument for their importance to the evolving structure of the oligarchy. It can also be seen from Table 7 that path structures in which an outgoing marriage tie is accompanied by an incoming business tie reduce the likelihood of the overall structure. Estimates of star parameters suggest the prevalence of heterogeneous stars in which a group of families have marriage ties with one group and business ties with another. The parameter estimates for homogeneous marriage in-stars and out-stars are both negative: there appears to have been a reduced conditional probability of a marriage tie to a family group if some other group also had such a tie and, to a lesser extent, if the first family group had another outgoing marriage tie.

The parameters for block-dependent densities suggest an enhanced likelihood of marriage ties within the Medici collection of family groups and, to a lesser extent, within the non-Medici collection; marriage ties between the two types of family groups were less likely. Business ties exhibit a substantially weaker pattern of the same form. Together, these characteristics of the network reflect what Padgett & Ansell noted was a remarkable interdependence of marriage and economic ties, on the one hand, and political partisanship, on the other, and they support their conclusion that the microstructure of marriage and economics was central to the formation of parties in Florence (1993, p. 1277). The block-dependence of marriage 2-paths takes a different and interesting form: such paths are less likely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection, and they are even more likely to link family groups of different types. The group containing members of the Medici family is the major contributor to this pattern, as they are the only Medici group with marriage connections outside the collection mobilized into the Medici party. Note that this structural effect is fitted at the same time as the cyclic pattern for marriage ties, so that although, as Padgett & Ansell noted, there are many more two-step marriage connections for non-Medici than for Medici partisans, many of the former

Table 7. Parameter estimates for model 6d fitted to the Florentine network

Model parameter                        Z           θ_Z      Approximate standard error
1-paths (choice)                       M           -5.17    1.02
                                       B           -7.37    1.25
2-cycles (reciprocity and exchange)    M ∩ M′       0.95    0.94
                                       B ∩ B′      10.33    1.72
                                       M ∩ B′       0.65    1.08
2-paths                                MM           0.66    0.32
                                       MB           0.16    0.38
                                       BM          -0.84    0.37
                                       BB           1.26    0.95
3-cycles                               MM ∩ M′      2.12    0.61
                                       MB ∩ M′     -0.35    0.85
2-stars                                MM′         -1.55    0.37
                                       M′M         -0.43    0.20
                                       BB′         -1.53    1.08
                                       B′B         -0.85    0.99
                                       MB′         -0.14    0.36
                                       M′B          0.92    0.35
Subgroup-dependent M effects
  1-paths within Medici                             3.71    1.12
  1-paths between subgroups                        -4.67    1.92
  1-paths within other subgroups                    0.96
Subgroup-dependent B effects
  1-paths within Medici                             0.70    1.06
  1-paths between subgroups                        -0.80    0.87
  1-paths within other subgroups                    0.10
Subgroup-dependent MM effects
  2-paths within Medici                            -1.33    0.46
  2-paths between subgroups                         1.08    0.44
  2-paths within other subgroups                    0.25

connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).
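The block-dependent parameters of models 6a–6d require each ordered pair of family groups to be assigned to one of three block types. A minimal sketch of that classification, assuming the groups are numbered 1 to 33 with groups 1 to 10 forming the Medici collection as described above (the function name and labels are hypothetical):

```python
from collections import Counter


def block_type(i, j, n_medici=10):
    """Classify the ordered pair of family groups (i, j), numbered from 1."""
    i_medici = i <= n_medici
    j_medici = j <= n_medici
    if i_medici and j_medici:
        return "within Medici"
    if not i_medici and not j_medici:
        return "within other"
    return "between"


n_groups = 33
counts = Counter(
    block_type(i, j)
    for i in range(1, n_groups + 1)
    for j in range(1, n_groups + 1)
    if i != j
)
print(counts)
```

Each possible tie then contributes to the parameter of its own block type only, which is one simple way to let a density or 2-path effect differ across the three types.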

Thus, model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell's detailed account.

5. Conclusion

The multivariate p* model is very general in form, and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996), and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number, r, of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research: much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus, a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus, the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form z(x) of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different 'parts' of the network (and in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that, in the literature on algebraic models for multivariate networks, there is a second tradition that can usefully guide such statistical developments. Local structural descriptions based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995) describe heterogeneity across individuals. Thus, a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown in general how Markov chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).
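To make the simulation step of such a parametric bootstrap concrete, the following is a minimal Metropolis sketch for a deliberately simple case: a single directed relation with only a choice and a reciprocity parameter. It is not the multivariate model fitted in this paper; the parameter values, function name and chain length are illustrative assumptions. For this simple model the dyads are in fact independent, so the sampler is shown only to illustrate the mechanics of a tie-toggling chain.

```python
import math
import random


def simulate(n, theta_choice, theta_recip, steps, seed=0):
    """Metropolis chain over directed graphs on n nodes.

    Each step proposes toggling one tie x[i][j] (i != j), chosen uniformly.
    The conditional log-odds of the tie being present is
    theta_choice + theta_recip * x[j][i].
    """
    rng = random.Random(seed)
    x = [[0] * n for _ in range(n)]
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)
        logit = theta_choice + theta_recip * x[j][i]
        # Log ratio of target probabilities: proposed state versus current state.
        log_ratio = logit if x[i][j] == 0 else -logit
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x[i][j] = 1 - x[i][j]
    return x


# Illustrative parameter values only.
network = simulate(n=10, theta_choice=-1.5, theta_recip=2.0, steps=5000, seed=1)
density = sum(map(sum, network)) / (10 * 9)
print(round(density, 3))
```

In a bootstrap along these lines, one would simulate many networks at the estimated parameter values, refit the model to each simulated network, and use the spread of the refitted estimates as an approximate standard error.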

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754), and the National Institute of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance, and Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.

Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.

Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.

Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Institute, 47, 77–92.

Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.

Boorman, S. A., & White, H. C. (1976). Social structure from multiple networks: II. Role structures. American Journal of Sociology, 81, 1384–1446.

Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.

Breiger, R. L., Boorman, S. A., & Arabie, P. (1975). An algorithm for clustering relational data, with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.

Coleman, J. S., Katz, E., & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N., & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R., & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B., & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency, and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure: II. Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E., & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M., & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.

Fienberg, S. E., Meyer, M. M., & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds), Operations research proceedings 1986, pp. 455–460. Berlin/Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathématiques, Informatique et Sciences Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B., & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O., & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds), Quo Vadis, Graph Theory? Annals of Discrete Mathematics, 55, 349–366.

Frank, O., & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J., & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J., & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W., & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W., & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J., & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D., & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.

Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L., & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E., & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A., & Pattison, P. (1998). Multivariate p* models of producers' market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P., & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F., & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P., & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G., & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). p1/2. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D., & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M., & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.

Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S., & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S., & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 61, 401–425.

Wasserman, S., & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S., & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A., & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C., & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999

Page 12: Logit models and logistic regressions for social …...Fienberg, Meyer & Wasserman (1985), Wasserman (1987), Iacobucci & Wasserman (1987) and Iacobucci (1989) extended thep1family

In social network modelling, Strauss & Ikeda (1990) established that estimation of $\theta$ for single dichotomous relations can be accomplished via logistic regression, using any standard logistic regression model-fitting routine. In particular, they showed that maximizing the pseudo-likelihood given in equation (6) is equivalent to maximizing the likelihood function for the fit of logistic regression to model (5) (for independent observations $x_{ijm}$). Further, they observed that such logistic regressions can be fitted using iteratively reweighted Gauss–Newton computational techniques, as implemented by any logistic regression model package.

The proof of this result uses the fact that the derivatives of the pseudo-likelihood, set equal to zero, are identical to those obtained from a logistic regression with the relational variables as data values. Thus, fitting p* can be done by using the logit p* form and assuming that the relational variables are actually statistically independent. The idea for this theorem was first suggested by Frank & Strauss (1986) for estimation of the parameters in their triad model. The generalization of this result to the three-way binary array X is straightforward.

The evaluation of the fit of multivariate p* is not straightforward, but it is helpful to compare the observed values $x_{ijm}$ with the fitted values $\hat{x}_{ijm}$. The fitted values, as is common with dichotomous variables, are defined as $\hat{x}_{ijm} = \hat{P}(X_{ijm} = 1 \mid X^c_{ijm})$. The estimated conditional probabilities are computed from

$$\operatorname{logit} \hat{P}(X_{ijm} = 1 \mid X^c_{ijm}) = \hat{\theta}' \delta(x_{ijm}).$$

Two useful indices of fit are the pseudo-likelihood ratio statistic

$$G^2_{PL} = 2 \sum x_{ijm} \log(x_{ijm} / \hat{x}_{ijm})$$

for a model, and the mean of the absolute value of the residuals $(x_{ijm} - \hat{x}_{ijm})$. In the examples below, we report both $G^2_{PL}$ and the mean absolute residual. Unfortunately, as with all other uses of this MPLE approach, the distribution of $G^2_{PL}$ is unknown, even asymptotically, and there is no straightforward way of estimating the standard errors of parameter estimates (although asymptotic standard errors calculated from logistic regression models can give approximate guidance to the modeller). Crouch & Wasserman (1998) give some preliminary results comparing MPLEs to MLEs, and report the optimistic finding that, for moderately large networks ($g > 10$), both standard errors and test statistics based on the pseudo-likelihood approach are quite close to those based on the exact likelihood.
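The two fit indices can be sketched in a few lines of code. This is a minimal illustration rather than the authors' software: the function names are hypothetical, and $G^2_{PL}$ is computed as the usual binomial deviance, which assumes that the sum in the pseudo-likelihood ratio statistic runs over both outcomes (tie present and tie absent) of every relational tie.

```python
import numpy as np

def pseudo_likelihood_ratio(x, x_hat):
    """G^2_PL computed as a binomial deviance over all ties (assumed reading)."""
    x = np.asarray(x, dtype=float).ravel()
    x_hat = np.asarray(x_hat, dtype=float).ravel()
    # Each tie contributes -2 * log(fitted probability of its observed outcome).
    p_observed = np.where(x == 1, x_hat, 1.0 - x_hat)
    return float(-2.0 * np.sum(np.log(p_observed)))

def mean_absolute_residual(x, x_hat):
    """Mean of |x_ijm - x_hat_ijm| over all ties."""
    x = np.asarray(x, dtype=float).ravel()
    x_hat = np.asarray(x_hat, dtype=float).ravel()
    return float(np.mean(np.abs(x - x_hat)))

# Toy values for illustration only (not data from the paper).
x_toy = [1, 0, 0, 1]
x_hat_toy = [0.8, 0.2, 0.4, 0.5]
print(pseudo_likelihood_ratio(x_toy, x_hat_toy))
print(mean_absolute_residual(x_toy, x_hat_toy))
```

Under this reading, a model with a single constant fitted probability p over n ties has $G^2_{PL} = -2n[p \log p + (1 - p) \log(1 - p)]$ and mean absolute residual $2p(1 - p)$. With n = 2 × 29 × 28 = 1624 and p ≈ 0.242, these come to roughly 1797 and 0.367, close to the values reported for model 1b in Table 4, which lends some support to the assumed reading.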

3.4. Computational details

Maximum pseudo-likelihood estimates of the parameters of model (1) are obtained by fitting the logistic regression model (5). In order to fit model (5), we compute, for each relational tie, the values of the 'explanatory variables' $z(x^+_{ijm}) - z(x^-_{ijm})$ corresponding to each statistic $z(x)$; we then use these as the observed explanatory variables for the realization of $X_{ijm}$ (the 'response variable') in the logistic regression corresponding to model (5).
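The procedure can be sketched as follows for a small dyadic model with two relations. This is an illustrative sketch under stated assumptions, not the authors' program: the function names are hypothetical, the effects included are two choice effects, two reciprocity effects and one exchange effect, each mutual dyad is assumed to be counted once in the reciprocity statistics, and the logistic regression is fitted by a plain Newton–Raphson (iteratively reweighted least squares) loop in place of a statistics package.

```python
import numpy as np

def change_statistics(xb, xn):
    """Response vector and change statistics z(x+) - z(x-) for every tie.

    Columns: choice_B, choice_N, reciprocity_B, reciprocity_N, exchange.
    """
    g = xb.shape[0]
    y, rows = [], []
    for i in range(g):
        for j in range(g):
            if i == j:
                continue
            # Tie from i to j on relation B.
            y.append(xb[i, j])
            rows.append([1, 0, xb[j, i], 0, xn[j, i]])
            # Tie from i to j on relation N.
            y.append(xn[i, j])
            rows.append([0, 1, 0, xn[j, i], xb[j, i]])
    return np.array(y, dtype=float), np.array(rows, dtype=float)

def fit_logistic(y, d, n_iter=50, tol=1e-10):
    """Newton-Raphson fit of a logistic regression without an added intercept."""
    theta = np.zeros(d.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-d @ theta))
        score = d.T @ (y - p)
        information = d.T @ (d * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(information, score)
        theta += step
        if np.max(np.abs(step)) < tol:
            break
    return theta

# Synthetic networks for illustration only (not the data analysed in the paper).
rng = np.random.default_rng(1)
g = 12
xb = (rng.random((g, g)) < 0.3).astype(int)
xn = (rng.random((g, g)) < 0.3).astype(int)
np.fill_diagonal(xb, 0)
np.fill_diagonal(xn, 0)
y, d = change_statistics(xb, xn)
theta_hat = fit_logistic(y, d)
names = ["choice_B", "choice_N", "recip_B", "recip_N", "exchange"]
print(dict(zip(names, theta_hat.round(2))))
```

The design matrix has one row per relational tie (here 2 × 12 × 11 rows) and no separate intercept, because the choice columns already play that role. Any standard logistic regression routine could replace `fit_logistic`.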

The computation of the values $z(x^+_{ijm}) - z(x^-_{ijm})$ is simple, but it is useful to note that the values may take a different form for the various types of relational ties (corresponding to the subscript m of $X_{ijm}$). For example, suppose that there are two relations, $X_l$ and $X_h$ respectively, and consider the parameter corresponding to the triadic effect $Z = (X_l X_h) \cap X_h$. If we assume homogeneity, then the sufficient statistic for this parameter is $f_Z$. For the two relations, the computed values of the explanatory variable for this triadic effect are equal to the changes in the statistic $f_Z$ when $x_{ijm}$ changes from 1 to 0, for $m = l$ or $h$. Thus, when $m = l$ (corresponding to the values for the first relation, $X_l$), we compute

$$\sum_k x_{ikh} x_{jkh}$$

as the value of the explanatory variable corresponding to this parameter, and when $m = h$ (corresponding to an $X_h$ tie), we compute

$$\sum_k x_{ikl} x_{kjh} + \sum_k x_{kil} x_{kjh}.$$
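The two expressions for the triadic effect can be checked numerically against a brute-force recomputation of $f_Z$. The sketch below assumes that $f_Z$ counts triples (i, k, j) with an $X_l$ tie from i to k, an $X_h$ tie from k to j and an $X_h$ tie from i to j, and that both matrices have zero diagonals; the function names are hypothetical.

```python
import numpy as np

def f_Z(xl, xh):
    """Count of l-then-h paths from i to j that coincide with an h tie from i to j."""
    return int(np.sum((xl @ xh) * xh))

def change_statistic(xl, xh, i, j, m):
    """Closed-form change in f_Z when the tie (i, j, m) goes from 0 to 1."""
    if m == "l":
        return int(np.sum(xh[i, :] * xh[j, :]))
    return int(np.sum(xl[i, :] * xh[:, j]) + np.sum(xl[:, i] * xh[:, j]))

def brute_force_change(xl, xh, i, j, m):
    """f_Z with the tie forced to 1 minus f_Z with the tie forced to 0."""
    plus = {"l": xl.copy(), "h": xh.copy()}
    minus = {"l": xl.copy(), "h": xh.copy()}
    plus[m][i, j] = 1
    minus[m][i, j] = 0
    return f_Z(plus["l"], plus["h"]) - f_Z(minus["l"], minus["h"])

# Synthetic matrices for illustration only.
rng = np.random.default_rng(0)
g = 7
xl = (rng.random((g, g)) < 0.4).astype(int)
xh = (rng.random((g, g)) < 0.4).astype(int)
np.fill_diagonal(xl, 0)
np.fill_diagonal(xh, 0)
agree = all(
    change_statistic(xl, xh, i, j, m) == brute_force_change(xl, xh, i, j, m)
    for i in range(g) for j in range(g) if i != j for m in ("l", "h")
)
print(agree)
```

The brute-force version works for any statistic but recomputes $f_Z$ twice per tie, so the closed forms are the practical choice for anything beyond small networks.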

4. Examples

We illustrate the construction and fitting of multivariate p* models using two examples.

4.1. The Grade 7 peer network

The first example is an extension of the data analysed by Wasserman & Pattison (1996). Vickers (1981) and Vickers & Chan (1981) obtained network data from 29 students in grade 7 in a school in Victoria, Australia. They asked students to nominate their classmates on a number of relations, including the following:

1) Who are your best friends in the class?
2) Who would you rather not have as a friend?

We label the relations defined by these two questions as $X_B$ (relation 1) and $X_N$ (relation 2), and their associated matrices as B and N, respectively. The matrix for the 'best friends' relation is given here as our Table 2, and the matrix for the 'not friends' relation as our Table 3. As noted by Wasserman & Pattison (1996), actors 1–12 are boys, while actors 13–29 are girls.

In Wasserman & Pattison (1996), we analysed the relation $X_B$ and established that it possessed strong reciprocity and transitivity effects. Here, we fit models simultaneously to the relations $X_B$ and $X_N$ in an attempt to model their mutual interdependence. Our models use the methodology described earlier and are guided by the literature that has speculated on the structure of positive and negative affect ties (see the discussion in Wasserman & Faust, 1994, Chapter 6, on signed graphs); we also compare our models to previous descriptive analyses of similar types of ties. We report the fit of a number of homogeneous models.

Models 1a and 1b – independence. We first fit two versions of a complete independence model, in which we make the (implausible) assumption that all observed ties are independent. In the first version of the model, we allow a single separate 'choice' parameter $\theta_Z$ (where Z may be either B or N) for each type of relation; in the second, more restricted version, we assume a single common choice parameter. In both versions of the model, the maximal cliques of the dependence graph have the form {(i, j, m)}; in model 1a, the parameters corresponding to this clique are assumed to depend on relation m (but not on actor i or j), whereas in model 1b, the parameter is assumed constant for all i, j and m. The sufficient statistics for model 1a are $f_B$ and $f_N$; model 1b has sufficient statistic $f_{B+N}$. The fit of models 1a and 1b is summarized in Table 4. Neither model provides a good fit, with the mean of the absolute residuals equal to approximately 0.37. Since model 1b is nested in model 1a, the difference between the pseudo-likelihood ratio statistics is of interest, and we note that model 1b appears to be no worse a fit than model 1a ($\Delta G^2_{PL} = 3.2$, and the models differ by one parameter).

Model 2 – multiplexity. Model 2 is a multiplexity model, with maximal cliques {(i, j, 1), (i, j, 2)}. The model allows for the possibility that an $X_B$ tie from i to j is conditionally dependent on an $X_N$ tie from i to j. The parameters of the model have the form $\theta_Z$, where Z may be B, N or $B \cap N$; the corresponding sufficient statistics are $f_B$, $f_N$ and $f_{B \cap N}$, respectively. Thus, this model adds a single multiplex parameter $\theta_{B \cap N}$ to the two choice parameters in model 1a. Model 2 appears to be a substantial improvement over model 1a ($\Delta G^2_{PL} = 253.7$, with one additional parameter), but the small frequency of $B \cap N$ ties implies that the MPLE of its corresponding parameter is likely to have a large standard error.

Models 3a and 3b – reciprocity and exchange. Model 3 assumes bivariate dyad independence (as described by Wasserman, 1987) and has maximal cliques {(i, j, 1), (i, j, 2), (j, i, 1), (j, i, 2)}. We fit two restricted versions of the model: first, model 3a, in which only choice and reciprocity effects are assumed (with parameters $\theta_Z$ for Z = B, N, $B \cap B'$ and $N \cap N'$); and second, model 3b, with an additional exchange parameter $\theta_Z$ for the relation $Z = B \cap N'$. In model 3a, the presence of an $X_B$ tie from i to j is assumed to be conditionally dependent on the presence of an $X_B$ tie from j to i (that is, on the presence of reciprocity); similarly for $X_N$ ties. Model 3b allows, in addition, the presence of an $X_B$ tie from i to j to be conditionally dependent on the presence of an $X_N$ tie from j to i (that is, on the exchange of an $X_N$ tie for an $X_B$ one). We have not fitted the most general homogeneous dyad-independence model, which includes multiplexity parameters, since B and N co-occur only rarely (and, as a result, it is difficult to fit parameters corresponding to relations such as $B \cap N$, $B \cap N \cap B'$, and so forth). The fit statistics in Table 4 indicate that not only is model 3a a substantial improvement over model 1a ($\Delta G^2_{PL} = 208.6$, with just two additional parameters), but also that model 3b provides a marginally better fit than model 3a ($\Delta G^2_{PL} = 27.1$, with one additional parameter). These figures suggest the presence of both reciprocity and exchange effects. Note, though, that the fit of model 3b is still not particularly good, with the mean of the absolute residuals equal to 0.311.

Table 2. Vickers & Chan's (1981) network data: 'best friends' relation

0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 1 0 1 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 1 0 1 0 1 1 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 0 1 0
1 1 1 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 0 0 1 1 1
1 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 0 0 1 1 1 1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 1 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 1 0 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 1 0 0 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0

Table 3. Vickers & Chan's (1981) network data: 'not friends' relation

0 0 0 0 0 0 1 0 1 1 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 0 0 1 1
0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 1 1 0 0 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0
1 0 1 1 0 0 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 1
0 0 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
0 0 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0
1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 1
1 0 0 1 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0

Table 4. Summary of fit of models 1a–5b to the grade 7 peer network

Model   No. of parameters   $G^2_{PL}$   Mean absolute residual
1a      2                   1794.1       0.366
1b      1                   1797.3       0.367
2       3                   1540.4       0.329
3a      4                   1585.5       0.315
3b      5                   1558.4       0.311
4       13                  1511.0       0.300
5a      19                  1220.6       0.241
5b      23                  1032.3       0.196

Model 4 – path dependence. Model 4 is a path-dependent model and assumes that a tie of any type from i to j may be conditionally dependent on ties of any type from j to some third individual k. Maximal cliques therefore have the form {(i, j, m), (j, i, h)} or {(i, j, m), (j, k, h), (k, i, p)}; parameters and sufficient statistics are given by $\theta_Z$ and $f_Z$ respectively, where Z may be any of the relations B, N, $B \cap B'$, $N \cap N'$, $B \cap N'$, BB, BN, NB, NN, $BB \cap B'$, $BN \cap B'$, $BN \cap N'$ and $NN \cap N'$. Compared to model 3b, model 4 adds only marginally to the fit ($\Delta G^2_{PL} = 47.4$, with eight additional parameters).

Models 5a and 5b – restricted Markov random graph models. The final set of models are path-dependent models with additional dependencies assumed on substantive grounds. All models have the model 4 parameters; in addition, model 5a possesses dependencies consistent with the transitivity-like hypothesis that friends are likely to agree on their relations with third parties (hence likely pairwise conditional dependencies between relational ties of type $X_B$ from i to j, i to k and j to k, and also between relational ties of type $X_B$ from i to j, of type $X_N$ from i to k, and of type $X_N$ from j to k). Model 5b possesses additional dependencies consistent with the claim that non-friends are likely to disagree on their relations with third parties (hence likely pairwise conditional dependencies between relational ties of type $X_N$ from i to j, of type $X_N$ from i to k, and of type $X_B$ from j to k, and also between relational ties of type $X_N$ from i to j, of type $X_B$ from i to k, and of type $X_N$ from j to k). (See Johnsen (1986) for a review and analysis of the literature on the structure of affective ties, and Pattison (1993) for an algebraic translation of these structural claims.) Model 5a adds {(i, j, 1), (j, k, 1), (i, k, 1)} and {(i, j, 1), (j, k, 2), (i, k, 2)} to the set of maximal cliques for model 4; model 5b also adds {(i, j, 2), (j, k, 1), (i, k, 2)} and {(i, j, 2), (j, k, 2), (i, k, 1)}. We note that all of the subcliques of these additional maximal cliques have corresponding parameters in models 5a and 5b; these additional subcliques correspond to various forms of stars: {(i, j, m), (i, k, h)}, {(i, j, m), (k, j, h)} and {(i, j, m), (j, k, h)}. As indicated in Table 4, the additional dependencies assumed by model 5a lead to a substantial improvement over the simple path-dependent model 4 ($\Delta G^2_{PL} = 290.4$, with six additional parameters), and those associated with model 5b lead to a modest further improvement in fit ($\Delta G^2_{PL} = 188.3$, with four additional parameters). The mean of the absolute residuals for model 5b is 0.196, suggesting a more reasonable fit to the data (but one that could lend itself to further possible improvement).

The MPLEs for the parameters of model 5b are displayed in Table 5. Positive estimates were observed for both reciprocity parameters and for the parameters associated with three of the four additional hypothesized dependencies. Thus, the conditional odds of a tie of any type appear to be enhanced if a reciprocal tie of the same type is present, if the tie completes one of the expected triadic structures for agreement between friends, or if the tie completes a triad in which an individual would rather not have as a friend any friend of someone who has been indicated as a non-friend. Negative estimates were obtained for the exchange parameter, for 2-stars comprising two incoming $X_B$ ties, and for 3-cycles comprising $X_B$ ties. Thus, the conditional odds of a tie of any type appear to be reduced by the presence of a reciprocated tie of the other type; in addition, the odds of an $X_B$ tie being directed to a particular individual are reduced if other $X_B$ ties are also directed to the same individual, or if the tie completes a 3-cycle of $X_B$ ties.

Table 5. Parameter estimates for model 5b fitted to the grade 7 peer network

Model parameter                       Z               $\hat{\theta}_Z$   Approximate standard error
1-paths (choice)                      B               -1.81              0.76
                                      N               -2.39              0.65
2-cycles (reciprocity & exchange)     $B \cap B'$      2.53              0.37
                                      $N \cap N'$      0.61              0.26
                                      $B \cap N'$     -0.67              0.28
2-paths                               BB               0.01              0.05
                                      BN              -0.03              0.04
                                      NB              -0.11              0.04
                                      NN               0.02              0.04
3-cycles                              $BB \cap B'$    -0.72              0.14
                                      $BN \cap B'$     0.05              0.08
                                      $BN \cap N'$     0.03              0.07
                                      $NN \cap N'$    -0.05              0.09
2-stars                               $BB'$           -0.36              0.08
                                      $BN'$           -0.08              0.04
                                      $NN'$            0.06              0.04
                                      $B'B$           -0.01              0.04
                                      $B'N$           -0.04              0.03
                                      $N'N$            0.07              0.02
Additional hypothesized constraints   $BB \cap B$      0.57              0.06
                                      $BN \cap N$      0.17              0.05
                                      $NB \cap N$      0.33              0.05
                                      $NN \cap B$     -0.09              0.06

4.2. Padgett & Ansell's Florentine network

Our second example is an analysis of marriage and business ties among groups of Florentine families (Padgett & Ansell, 1993). In an analysis of the rise to power of the Medici family in Florence in the early fifteenth century, Padgett & Ansell constructed a number of network relations among 33 groups of elite families, including marriage and business or economic ties. The construction was based on a coding of various types of network relations among a 92-family ruling elite from Kent's (1978) description of the network foundations of the Medici party and their opponents. Padgett & Ansell used marriage and economic networks to derive a clustering of the 92 families into 33 family groups (using the CONCOR algorithm; see Breiger, Boorman & Arabie, 1975); they then coded a relation of a particular type between two family groups if there were at least two pairs of families, with one family from each group, linked by a relation of that type. The analysis presented below is for marriage and economic relations among these 33 family groups, shown in Figure 2a of Padgett & Ansell (1993); for the purpose of the analysis reported below, within-group relationships are ignored and the various types of economic ties are aggregated into a single business/economic relation. Thus, a marriage tie is coded from one group to another if a woman of the first group is married to a man in the second; a business/economic tie signifies the presence of trading or partnership relationships, the sharing or renting of real estate, or a bank employment relation (see Padgett & Ansell, 1993, pp. 1265–1266).
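The coding rule for group-level ties can be sketched as follows. This is only an illustration of the rule as described here (a tie between two groups when at least two family pairs, one family from each group, are linked), not Padgett & Ansell's actual procedure; the function name, the toy data and the treatment of direction (pairs counted separately for each direction) are assumptions.

```python
import numpy as np

def group_ties(family_ties, group_of, n_groups, min_pairs=2):
    """Aggregate a directed family-level relation into a group-level relation."""
    family_ties = np.asarray(family_ties)
    group_of = np.asarray(group_of)
    # membership[f, a] = 1 if family f belongs to group a.
    membership = np.zeros((len(group_of), n_groups), dtype=int)
    membership[np.arange(len(group_of)), group_of] = 1
    # counts[a, b] = number of tied pairs (family in a, family in b).
    counts = membership.T @ family_ties @ membership
    groups = (counts >= min_pairs).astype(int)
    np.fill_diagonal(groups, 0)  # within-group relationships are ignored
    return groups

# Toy example: five families in three groups (hypothetical data).
group_of = [0, 0, 1, 1, 2]
ties = np.zeros((5, 5), dtype=int)
ties[0, 2] = 1  # group 0 -> group 1
ties[1, 3] = 1  # group 0 -> group 1 (second pair, so the group tie is coded)
ties[2, 4] = 1  # group 1 -> group 2 (only one pair, so no group tie)
ties[0, 1] = 1  # within group 0 (ignored)
print(group_ties(ties, group_of, 3))
```

Writing the aggregation as a matrix product keeps the pair counts available, so a different threshold can be tried by changing `min_pairs` alone.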

Padgett & Ansell used the interconnections among social and demographic factors, these relational ties, and actions on the part of Cosimo de' Medici to explain the source of the latter's extraordinary power; here, we examine the joint network structure of the marriage and business/economic ties.

We label the relations studied by Padgett & Ansell as $X_B$ (business ties) and $X_M$ (marriage ties). Their associated matrices are B and M, respectively.

In Table 6, we report the fit of six classes of models, similar in construction to those reported for the grade 7 peer network. As for the grade 7 peer network, models 1a and 1b are two- and one-parameter complete independence models, respectively, and model 2 is a multiplexity model. It is clear from Table 6 that there is little improvement in fit of the two-parameter choice complete independence model (model 1a) over the one-parameter choice model (model 1b) ($\Delta G^2_{PL} = 0.7$, with one extra parameter); in addition, permitting dependencies among marriage and business ties for the same individuals does little to improve model fit ($\Delta G^2_{PL} = 0.4$ for model 2 compared to model 1a). Models 3a and 3b are reciprocity and exchange models. Model 3a adds to model 1a the reciprocity effects for $X_B$ and $X_M$ ties; model 3b further adds the exchange effect that allows conditional dependence of a marriage tie from i to j and a business tie from j to i. The reciprocity effects in model 3a lead to a substantial improvement in fit over model 1a ($\Delta G^2_{PL} = 164.0$, with two additional parameters), but no further improvement is achieved by permitting the dyadic exchange of marriage and business ties ($\Delta G^2_{PL} = 0.2$). Model 4 is a path-dependent model and is a marginal improvement in fit over model 3b ($\Delta G^2_{PL} = 45.1$, with six additional parameters). Parameters corresponding to cycles with two or more business ties were excluded from the model because of the infrequency of occurrence of such structures.

Since, as Padgett & Ansell (1993) note, the gaining of hierarchical status was the primary consideration in the arrangement of marriage ties between elite families, we might expect marriage ties to exhibit a tendency towards transitivity. Hence, model 5a assumes, in addition to conditional dependencies for paths of length 2, pairwise conditional dependencies among marriage ties from i to j, j to k and i to k (and hence adds a parameter corresponding to the relation $X = MM \cap M$). Further, all possible stars comprising two relations are added as well, in order to investigate possible interdependencies between marriage and business ties that are not evident at the level of ties from an actor i to an actor j (see the comparison between the complete independence model 1a and the multiplex model 2). These dependencies also require various star parameters $\theta_Z$, for Z equal to $MM'$, $M'M$, $M'B$ and $BB'$.

Table 6. Summary of fit of models 1a–6d to the Florentine network

Model   No. of parameters   $G^2_{PL}$   Mean absolute residual
1a      2                   487.2        0.048
1b      1                   487.9        0.048
2       3                   486.8        0.048
3a      4                   323.2        0.032
3b      5                   323.0        0.032
4       11                  277.9        0.029
5a      18                  243.7        0.026
5b      17                  246.3        0.026
6a      21                  227.9        0.026
6b      23                  226.7        0.026
6c      23                  225.2        0.026
6d      23                  217.0        0.025

The fit of model 5a was a modest improvement over that of model 4 ($\Delta G^2_{PL} = 34.2$, with six additional parameters). The estimated parameter corresponding to the relation $MM \cap M$ is not large, so in model 5b the parameter is removed, with little effect on the fit of the model ($\Delta G^2_{PL} = 2.6$).

A final set of models fitted to the data investigated the possibility of structural differences in ties according to party affiliation. As Padgett & Ansell (1993) observed, the first 10 family groups are substantially identified with the Medici party (the Medici family themselves comprising group 1), whereas the remaining groups of families are not. Padgett & Ansell described the remarkable structural differences between the network of relations within the Medici party and within the remaining (largely oligarchic) set. Models 6a–6d therefore allow various model 5b parameters to differ according to whether they refer to ties lying within the collection of Medici blocks, to ties connecting non-Medici blocks, or to ties crossing the boundary between the two collections of blocks. Model 6a allows such variation for the density parameter and is a substantial improvement over model 5b ($\Delta G^2_{PL} = 18.4$, with four additional parameters). Model 6b permits the parameters for 'mixed' out-stars comprising marriage and business ties to differ for the three types of blocks and is not a substantial improvement over model 6a ($\Delta G^2_{PL} = 1.4$). Model 6c allows heterogeneity across blocks in the parameters for 2-paths comprising marriage and business ties; it also fails to improve fit compared to model 6a ($\Delta G^2_{PL} = 2.5$). The final model, 6d, permits heterogeneity across blocks in the parameters for paths comprising two marriage ties; in this case, there is a modest improvement in fit compared to model 6a ($\Delta G^2_{PL} = 10.8$, with two additional parameters).

The estimated parameters for model 6d are shown in Table 7. The estimates suggest a strong tendency for reciprocated business ties, a tendency that is unsurprising given the form of business or economic ties such as partnerships. There are weaker tendencies for the existence of 2-paths comprising either marriage or business ties; marriage ties also appear to be more likely if they complete a cycle of three marriage ties. Padgett & Ansell (1993) noted the presence of these cycles and analysed both their development and their consequences; they make a compelling argument for their importance to the evolving structure of the oligarchy. It can also be seen from Table 7 that path structures in which an outgoing marriage tie is accompanied by an incoming business tie reduce the likelihood of the overall structure. Estimates of star parameters suggest the prevalence of heterogeneous stars, in which a group of families have marriage ties with one group and business ties with another. The parameter estimates for homogeneous marriage in-stars and out-stars are both negative: there appears to have been a reduced conditional probability of a marriage tie to a family group if some other group also had such a tie and, to a lesser extent, if the first family group had another outgoing marriage tie.

Table 7. Parameter estimates for model 6d fitted to the Florentine network

Model parameter                       Z               $\hat{\theta}_Z$   Approximate standard error
1-paths (choice)                      M               -5.17              1.02
                                      B               -7.37              1.25
2-cycles (reciprocity and exchange)   $M \cap M'$      0.95              0.94
                                      $B \cap B'$     10.33              1.72
                                      $M \cap B'$      0.65              1.08
2-paths                               MM               0.66              0.32
                                      MB               0.16              0.38
                                      BM              -0.84              0.37
                                      BB               1.26              0.95
3-cycles                              $MM \cap M'$     2.12              0.61
                                      $MB \cap M'$    -0.35              0.85
2-stars                               $MM'$           -1.55              0.37
                                      $M'M$           -0.43              0.20
                                      $BB'$           -1.53              1.08
                                      $B'B$           -0.85              0.99
                                      $MB'$           -0.14              0.36
                                      $M'B$            0.92              0.35
Subgroup-dependent M effects
  1-paths within Medici                                3.71              1.12
  1-paths between subgroups                           -4.67              1.92
  1-paths within other subgroups                       0.96
Subgroup-dependent B effects
  1-paths within Medici                                0.70              1.06
  1-paths between subgroups                           -0.80              0.87
  1-paths within other subgroups                       0.10
Subgroup-dependent MM effects
  2-paths within Medici                               -1.33              0.46
  2-paths between subgroups                            1.08              0.44
  2-paths within other subgroups                       0.25

The parameters for block-dependent densities suggest an enhanced likelihood of marriage ties within the Medici collection of family groups and, to a lesser extent, within the non-Medici collection; marriage ties between the two types of family groups were less likely. Business ties exhibit a substantially weaker pattern of the same form. Together, these characteristics of the network reflect what Padgett & Ansell noted was a remarkable interdependence of marriage and economic ties on the one hand, and political partisanship on the other, and they support their conclusion that the microstructure of marriage and economics was central to the formation of parties in Florence (1993, p. 1277). The block-dependence of marriage 2-paths takes a different and interesting form: such paths are less likely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection, and they are even more likely to link family groups of different types. The group containing members of the Medici family is the major contributor to this pattern, as they are the only Medici group with marriage connections outside the collection mobilized into the Medici party. Note that this structural effect is fitted at the same time as the cyclic pattern for marriage ties, so that although, as Padgett & Ansell noted, there are many more two-step marriage connections for non-Medici than for Medici partisans, many of the former connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).
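Block-dependent effects of the kind used in models 6a–6d can be set up by splitting an explanatory variable according to the block type of each ordered pair of family groups. The sketch below does this for the choice (density) effect of one relation. It assumes, following the description of the first 10 family groups as the Medici collection, that groups are indexed so that the first `n_first` belong to that collection; the function names are hypothetical.

```python
import numpy as np

def block_type(i, j, n_first=10):
    """0 = both groups in the first collection, 1 = between collections, 2 = both in the rest."""
    in_first_i, in_first_j = i < n_first, j < n_first
    if in_first_i and in_first_j:
        return 0
    if in_first_i != in_first_j:
        return 1
    return 2

def block_choice_columns(n_groups, n_first=10):
    """One indicator column per block type, one row per ordered pair (i, j), i != j."""
    rows = []
    for i in range(n_groups):
        for j in range(n_groups):
            if i == j:
                continue
            indicator = np.zeros(3, dtype=int)
            indicator[block_type(i, j, n_first)] = 1
            rows.append(indicator)
    return np.array(rows)

columns = block_choice_columns(33)
print(columns.sum(axis=0))  # number of ordered pairs of each block type
```

Because the three indicators sum to the overall choice column, one of them is redundant once a common choice parameter is in the model, so either a column is dropped or a constraint is imposed; which coding the authors used is not stated in this section. The same splitting applies to any other change statistic, for example by multiplying a 2-path change statistic by each indicator in turn.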

Thus, model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell's detailed account.

5. Conclusion

The multivariate p* model is very general in form and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996) and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number r of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research – much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus, a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus, the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form $z(x)$ of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different 'parts' of the network (and, in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that, in the literature on algebraic models for multivariate networks, there is a second tradition that can usefully guide such statistical developments. Local structural descriptions based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995) describe heterogeneity across individuals. Thus, a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown in general how Markov chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).
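The simulation step of such a bootstrap can be illustrated with a minimal sketch. It is not the procedure of Preisler (1993) or of Crouch & Wasserman (1998); it assumes a single directed relation with only a choice and a reciprocity term, a symmetric single-tie toggle proposal, and a reciprocity statistic that counts each mutual dyad once. The function name and the parameter values are hypothetical.

```python
import math
import random


def simulate_network(n, theta_choice, theta_recip, steps, seed=0):
    """Draw one directed 0/1 network from a toy choice + reciprocity model."""
    rng = random.Random(seed)
    x = [[0] * n for _ in range(n)]        # start from the empty network
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)     # ordered pair of distinct actors
        # Change in the linear predictor if x[i][j] moved from 0 to 1.
        delta = theta_choice + theta_recip * x[j][i]
        # Removing an existing tie reverses the sign of that change.
        log_ratio = delta if x[i][j] == 0 else -delta
        # Metropolis acceptance step for a symmetric proposal.
        if rng.random() < math.exp(min(0.0, log_ratio)):
            x[i][j] = 1 - x[i][j]
    return x


def density(x):
    """Proportion of possible directed ties that are present."""
    n = len(x)
    return sum(map(sum, x)) / (n * (n - 1))
```

In a parametric bootstrap, many such networks would be simulated at the fitted parameter values, the model refitted to each, and the spread of the refitted estimates used as an approximate standard error.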

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754) and the National Institute of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance, and Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.

Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.

Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.

Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Association, 47, 77–92.

Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.

Boorman, S. A. & White, H. C. (1976). Social structure from multiple networks: II. Role structures. American Journal of Sociology, 81, 1384–1446.

Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.

Breiger, R. L., Boorman, S. A. & Arabie, P. (1975). An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.

Coleman, J. S., Katz, E. & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N. & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R. & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B. & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure: II. Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E. & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds), Operations research proceedings 1986, pp. 455–460. Berlin/Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathématiques, Informatique et Sciences Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B. & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O. & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds), Quo Vadis, Graph Theory? Annals of Discrete Mathematics, 55, 349–366.

Frank, O. & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J. & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J. & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W. & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W. & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J. & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D. & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.


Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L. & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E. & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A. & Pattison, P. (1998). Multivariate p* models of producers’ market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P. & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F. & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P. & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G. & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). p1/2. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D. & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M. & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.


Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S. & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S. & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S. & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 60, 401–425.

Wasserman, S. & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S. & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A. & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C. & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999


effect are equal to the changes in the statistic $f_Z$ when $x_{ijm}$ changes from 1 to 0, for $m = l$ or $h$. Thus, when $m = l$ (corresponding to the values for the first relation, $X_l$), we compute $\sum_k x_{ikh} x_{jkh}$ as the value of the explanatory variable corresponding to this parameter, and when $m = h$ (corresponding to an $X_h$ tie), we compute

$$\sum_k x_{ikl} x_{kjh} + \sum_k x_{kil} x_{kjh}.$$

4. Examples

We illustrate the construction and fitting of multivariate p* models using two examples.

4.1. The Grade 7 peer network

The first example is an extension of the data analysed by Wasserman & Pattison (1996). Vickers (1981) and Vickers & Chan (1981) obtained network data from 29 students in grade 7 in a school in Victoria, Australia. They asked students to nominate their classmates on a number of relations, including the following:

1) Who are your best friends in the class?
2) Who would you rather not have as a friend?

We label the relations defined by these two questions as $X_B$ (relation 1) and $X_N$ (relation 2), and their associated matrices as B and N, respectively. The matrix for the ‘best friends’ relation is given here as our Table 2, and the matrix for the ‘not friends’ relation as our Table 3. As noted by Wasserman & Pattison (1996), actors 1–12 are boys, while actors 13–29 are girls.

In Wasserman & Pattison (1996), we analysed the relation $X_B$ and established that it possessed strong reciprocity and transitivity effects. Here, we fit models simultaneously to the relations $X_B$ and $X_N$ in an attempt to model their mutual interdependence. Our models use the methodology described earlier and are guided by the literature that has speculated on the structure of positive and negative affect ties (see the discussion in Wasserman & Faust, 1994, Chapter 6, on signed graphs); we also compare our models to previous descriptive analyses of similar types of ties. We report the fit of a number of homogeneous models.

Models 1a and 1b – independence. We first fit two versions of a complete independence model, in which we make the (implausible) assumption that all observed ties are independent. In the first version of the model, we allow a single separate ‘choice’ parameter $h_Z$ (where Z may be either B or N) for each type of relation; in the second, more restricted version, we assume a single common choice parameter. In both versions of the model, the maximal cliques of the dependence graph have the form {(i, j, m)}; in model 1a, the parameters corresponding to this clique are assumed to depend on relation m (but not on actor i or j), whereas in model 1b, the parameter is assumed constant for all i, j and m. The sufficient statistics for model 1a are $f_B$ and $f_N$; model 1b has sufficient statistic $f_{B+N}$. The fit of models 1a and 1b is summarized in Table 4. Neither model provides a good fit, with the mean of the absolute residuals equal to approximately 0.37. Since model 1b is nested in model 1a, the difference between the pseudo-likelihood ratio statistics is of interest, and we note that model 1b appears to be no worse a fit than model 1a ($\Delta G^2_{PL} = 3.2$, and the models differ by one parameter).
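Under complete independence with one common choice parameter, the logistic regression contains only an intercept, so the fit can be written down directly from the overall tie density. The sketch below illustrates this for model 1b; the function name is hypothetical, the matrices are assumed to be square 0/1 lists with the diagonal ignored, and the density is assumed to lie strictly between 0 and 1.

```python
import math


def independence_fit(matrices):
    """Fit one common choice parameter to a list of 0/1 adjacency matrices."""
    ties = []
    for x in matrices:
        n = len(x)
        # Every ordered pair of distinct actors contributes one observation.
        ties.extend(x[i][j] for i in range(n) for j in range(n) if i != j)
    p = sum(ties) / len(ties)             # fitted probability = observed density
    theta = math.log(p / (1 - p))         # choice parameter on the logit scale
    # Deviance of the intercept-only logistic regression.
    g2 = -2 * sum(math.log(p) if t else math.log(1 - p) for t in ties)
    mean_abs_residual = sum(abs(t - p) for t in ties) / len(ties)
    return theta, g2, mean_abs_residual
```

Calling the function separately on each relation and adding the two deviances gives the corresponding two-parameter version (model 1a). In this intercept-only case, the mean absolute residual is 2p(1 − p), where p is the density.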


Model 2 – multiplexity. Model 2 is a multiplexity model with maximal cliques {(i, j, 1), (i, j, 2)}. The model allows for the possibility that an $X_B$ tie from i to j is conditionally dependent on an $X_N$ tie from i to j. The parameters of the model have the form $h_Z$, where Z may be B, N or $B \cap N$; the corresponding sufficient statistics are $f_B$, $f_N$ and $f_{B \cap N}$, respectively. Thus, this model adds a single multiplex parameter, $h_{B \cap N}$, to the two choice parameters in model 1a. Model 2 appears to be a substantial improvement over model 1a ($\Delta G^2_{PL} = 253.7$ with one additional parameter), but the small frequency of $B \cap N$ ties implies that the MPLE of its corresponding parameter is likely to have a large standard error.

Table 2. Vickers & Chan’s (1981) network data: ‘best friends’ relation

0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 1 0 1 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 1 0 1 0 1 1 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 0 1 0
1 1 1 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 0 0 1 1 1
1 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 0 0 1 1 1 1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 1 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 1 0 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 1 0 0 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0

Table 3. Vickers & Chan’s (1981) network data: ‘not friends’ relation

0 0 0 0 0 0 1 0 1 1 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 0 0 1 1
0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 1 1 0 0 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0
1 0 1 1 0 0 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 1
0 0 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
0 0 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0
1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 1
1 0 0 1 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0

Table 4. Summary of fit of models 1a–5b to the grade 7 peer network

Model   No. of parameters   $G^2_{PL}$   Mean absolute residual
1a       2                  1794.1       0.366
1b       1                  1797.3       0.367
2        3                  1540.4       0.329
3a       4                  1585.5       0.315
3b       5                  1558.4       0.311
4       13                  1511.0       0.300
5a      19                  1220.6       0.241
5b      23                  1032.3       0.196

Models 3a and 3b – reciprocity and exchange. Model 3 assumes bivariate dyad independence (as described by Wasserman, 1987), and has maximal cliques {(i, j, 1), (i, j, 2), (j, i, 1), (j, i, 2)}. We fit two restricted versions of the model: first, model 3a, in which only choice and reciprocity effects are assumed (with parameters $h_Z$ for Z = B, N, $B \cap B'$ and $N \cap N'$), and second, model 3b, with an additional exchange parameter $h_Z$ for the relation $Z = B \cap N'$. In model 3a, the presence of an $X_B$ tie from i to j is assumed to be conditionally dependent on the presence of an $X_B$ tie from j to i (that is, on the presence of reciprocity); similarly for $X_N$ ties. Model 3b allows, in addition, the presence of an $X_B$ tie from i to j to be conditionally dependent on the presence of an $X_N$ tie from j to i (that is, on the exchange of an $X_N$ tie for an $X_B$ one). We have not fitted the most general homogeneous dyad-independence model, which includes multiplexity parameters, since B and N co-occur only rarely (and, as a result, it is difficult to fit parameters corresponding to relations such as $B \cap N$, $B \cap N \cap B'$, and so forth). The fit statistics in Table 4 indicate that not only is model 3a a substantial improvement over model 1a ($\Delta G^2_{PL} = 208.6$ with just two additional parameters), but also that model 3b provides a marginally better fit than model 3a ($\Delta G^2_{PL} = 27.1$ with one additional parameter). These figures suggest the presence of both reciprocity and exchange effects. Note, though, that the fit of model 3b is still not particularly good, with the mean of the absolute residuals equal to 0.311.
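The explanatory variables behind a model of this dyadic kind can be sketched as follows. The code builds one row per possible tie, with the choice, reciprocity and exchange terms of model 3b as predictors for a logistic regression. The function and column names are hypothetical, and the sketch assumes that each reciprocity statistic counts a mutual dyad once and that the exchange statistic counts each pairing of an $X_B$ tie from i to j with an $X_N$ tie from j to i once.

```python
def dyadic_predictors(b, n_rel):
    """Return one dict of predictors per possible tie in relations B and N."""
    size = len(b)
    rows = []
    for i in range(size):
        for j in range(size):
            if i == j:
                continue
            # Response is the B tie from i to j.
            rows.append({
                "y": b[i][j],
                "choice_B": 1, "choice_N": 0,
                "recip_B": b[j][i], "recip_N": 0,
                "exchange": n_rel[j][i],   # N tie returned from j to i
            })
            # Response is the N tie from i to j.
            rows.append({
                "y": n_rel[i][j],
                "choice_B": 0, "choice_N": 1,
                "recip_B": 0, "recip_N": n_rel[j][i],
                "exchange": b[j][i],       # B tie returned from j to i
            })
    return rows
```

Regressing `y` on the remaining columns without an intercept would then give pseudo-likelihood estimates for this sketch; dropping the `exchange` column corresponds to model 3a.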

Model 4 – path dependence. Model 4 is a path-dependent model, and assumes that a tie of any type from i to j may be conditionally dependent on ties of any type from j to some third individual k. Maximal cliques therefore have the form {(i, j, m), (j, i, h)} or {(i, j, m), (j, k, h), (k, i, p)}; parameters and sufficient statistics are given by $h_Z$ and $f_Z$, respectively, where Z may be any of the relations B, N, $B \cap B'$, $N \cap N'$, $B \cap N'$, BB, BN, NB, NN, $BB \cap B'$, $BN \cap B'$, $BN \cap N'$ and $NN \cap N'$. Compared to model 3b, model 4 adds only marginally to the fit ($\Delta G^2_{PL} = 47.4$ with eight additional parameters).
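Counts of 2-paths and 3-cycles of the kind named above can be illustrated with a short sketch. It assumes that the three actors are distinct and that each ordered triple is counted separately (so a 3-cycle within one relation is counted three times); the paper's own counting convention may differ, and the function names are hypothetical.

```python
def two_paths(first, second):
    """Count paths i -> j -> k with a `first` tie followed by a `second` tie."""
    n = len(first)
    return sum(
        first[i][j] * second[j][k]
        for i in range(n) for j in range(n) for k in range(n)
        if len({i, j, k}) == 3          # require three distinct actors
    )


def three_cycles(first, second, third):
    """Count cycles i -> j -> k -> i using the three relations in order."""
    n = len(first)
    return sum(
        first[i][j] * second[j][k] * third[k][i]
        for i in range(n) for j in range(n) for k in range(n)
        if len({i, j, k}) == 3
    )
```

For example, `two_paths(B, N)` corresponds to the relation BN, and `three_cycles(B, N, B)` to $BN \cap B'$, for matrices B and N.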

Models 5a and 5b – restricted Markov random graph models. The final set of models are path-dependent models with additional dependencies assumed on substantive grounds. All models have the model 4 parameters; in addition, model 5a possesses dependencies consistent with the transitivity-like hypothesis that friends are likely to agree on their relations with third parties (hence, likely pairwise conditional dependencies between relational ties of type $X_B$ from i to j, i to k and j to k, and also between relational ties of type $X_B$ from i to j, of type $X_N$ from i to k, and of type $X_N$ from j to k). Model 5b possesses additional dependencies consistent with the claim that non-friends are likely to disagree on their relations with third parties (hence, likely pairwise conditional dependencies between relational ties of type $X_N$ from i to j, of type $X_N$ from i to k, and of type $X_B$ from j to k, and also between relational ties of type $X_N$ from i to j, of type $X_B$ from i to k, and of type $X_N$ from j to k). (See Johnsen (1986) for a review and analysis of the literature on the structure of affective ties, and Pattison (1993) for an algebraic translation of these structural claims.) Model 5a adds {(i, j, 1), (j, k, 1), (i, k, 1)} and {(i, j, 1), (j, k, 2), (i, k, 2)} to the set of maximal cliques for model 4; model 5b also adds {(i, j, 2), (j, k, 1), (i, k, 2)} and {(i, j, 2), (j, k, 2), (i, k, 1)}. We note that all of the subcliques of these additional maximal cliques have corresponding parameters in models 5a and 5b; these additional subcliques correspond to various forms of stars: {(i, j, m), (i, k, h)}, {(i, j, m), (k, j, h)} and {(i, j, m), (j, k, h)}. As indicated in Table 4, the additional dependencies assumed by model 5a lead to a substantial improvement over the simple path-dependent model 4 ($\Delta G^2_{PL} = 290.4$ with six additional parameters), and those associated with model 5b lead to a modest further improvement in fit ($\Delta G^2_{PL} = 188.3$ with four additional parameters). The mean of the absolute residuals for model 5b is 0.196, suggesting a more reasonable fit to the data (but one that could lend itself to further possible improvement).
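The nested comparisons quoted for the grade 7 network follow directly from the $G^2_{PL}$ values and parameter counts of Table 4. The short sketch below reproduces that arithmetic; the dictionary and function names are hypothetical, and the differences are used here only as the rough descriptive guides that the text takes them to be.

```python
# (number of parameters, G2_PL) for each model, as reported in Table 4.
TABLE_4 = {
    "1a": (2, 1794.1), "1b": (1, 1797.3), "2": (3, 1540.4),
    "3a": (4, 1585.5), "3b": (5, 1558.4), "4": (13, 1511.0),
    "5a": (19, 1220.6), "5b": (23, 1032.3),
}


def compare(restricted, general):
    """Return (drop in G2_PL, additional parameters) for a nested pair."""
    p_small, g_small = TABLE_4[restricted]
    p_large, g_large = TABLE_4[general]
    return round(g_small - g_large, 1), p_large - p_small
```

For instance, `compare("4", "5a")` gives the drop of 290.4 for six additional parameters cited for model 5a.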

The MPLEs for the parameters of model 5b are displayed in Table 5. Positive estimates were observed for both reciprocity parameters, and for the parameters associated with three of the four additional hypothesized dependencies. Thus, the conditional odds of a tie of any type appear to be enhanced if a reciprocal tie of the same type is present, if the tie completes one of the expected triadic structures for agreement between friends, or if the tie completes a triad in which an individual would rather not have as a friend any friend of someone who has been indicated as a non-friend. Negative estimates were obtained for the exchange parameter, for 2-stars comprising two incoming $X_B$ ties, and for 3-cycles comprising $X_B$ ties. Thus, the conditional odds of a tie of any type appear to be reduced by the presence of a reciprocated tie of the other type; in addition, the odds of an $X_B$ tie being directed to a particular individual are reduced if other $X_B$ ties are also directed to the same individual, or if the tie completes a 3-cycle of $X_B$ ties.

Table 5. Parameter estimates for model 5b fitted to the grade 7 peer network

Model parameter            Z               $h_Z$    Approximate standard error
1-paths                    $B$             −1.81    0.76
(choice)                   $N$             −2.39    0.65
2-cycles                   $B \cap B'$      2.53    0.37
(reciprocity &             $N \cap N'$      0.61    0.26
exchange)                  $B \cap N'$     −0.67    0.28
2-paths                    $BB$             0.01    0.05
                           $BN$            −0.03    0.04
                           $NB$            −0.11    0.04
                           $NN$             0.02    0.04
3-cycles                   $BB \cap B'$    −0.72    0.14
                           $BN \cap B'$     0.05    0.08
                           $BN \cap N'$     0.03    0.07
                           $NN \cap N'$    −0.05    0.09
2-stars                    $BB'$           −0.36    0.08
                           $BN'$           −0.08    0.04
                           $NN'$            0.06    0.04
                           $B'B$           −0.01    0.04
                           $B'N$           −0.04    0.03
                           $N'N$            0.07    0.02
Additional                 $BB \cap B$      0.57    0.06
hypothesized               $BN \cap N$      0.17    0.05
constraints                $NB \cap N$      0.33    0.05
                           $NN \cap B$     −0.09    0.06

4.2. Padgett & Ansell’s Florentine network

Our second example is an analysis of marriage and business ties among groups of Florentine families (Padgett & Ansell, 1993). In an analysis of the rise to power of the Medici family in Florence in the early fifteenth century, Padgett & Ansell constructed a number of network relations among 33 groups of elite families, including marriage and business or economic ties. The construction was based on a coding of various types of network relations among a 92-family ruling elite from Kent’s (1978) description of the network foundations of the Medici party and their opponents. Padgett & Ansell used marriage and economic networks to derive a clustering of the 92 families into 33 family groups (using the CONCOR algorithm; see Breiger, Boorman & Arabie, 1975); they then coded a relation of a particular type between two family groups if there were at least two pairs of families, with one family from each group, linked by a relation of that type. The analysis presented below is for marriage and economic relations among these 33 family groups, shown in figure 2a of Padgett & Ansell (1993); for the purpose of the analysis reported below, within-group relationships are ignored and the various types of economic ties are aggregated into a single business/economic relation. Thus, a marriage tie is coded from one group to another if a woman of the first group is married to a man in the second; a business/economic tie signifies the presence of trading or partnership relationships, the sharing or renting of real estate, or a bank employment relation (see Padgett & Ansell, 1993, pp. 1265–1266).

Padgett & Ansell used the interconnections among social and demographic factors, these relational ties, and actions on the part of Cosimo de’ Medici to explain the source of the latter’s extraordinary power; here, we examine the joint network structure of the marriage and business/economic ties.

We label the relations studied by Padgett & Ansell as $X_B$ (business ties) and $X_M$ (marriage ties). Their associated matrices are B and M, respectively.

In Table 6, we report the fit of six classes of models, similar in construction to those reported for the grade 7 peer network. As for the grade 7 peer network, models 1a and 1b are two- and one-parameter complete independence models, respectively, and model 2 is a multiplexity model. It is clear from Table 6 that there is little improvement in fit of the two-parameter choice complete independence model (model 1a) over the one-parameter choice model (model 1b) ($\Delta G^2_{PL} = 0.7$ with one extra parameter); in addition, permitting dependencies among marriage and business ties for the same individuals does little to improve model fit ($\Delta G^2_{PL} = 0.4$ for model 2 compared to model 1a). Models 3a and 3b are reciprocity and exchange models. Model 3a adds to model 1a the reciprocity effects for $X_B$ and $X_M$ ties; model 3b further adds the exchange effect that allows conditional dependence of a marriage tie from i to j and a business tie from j to i. The reciprocity effects in model 3a lead to a substantial improvement in fit over model 1a ($\Delta G^2_{PL} = 164.0$ with two additional parameters), but no further improvement is achieved by permitting the dyadic exchange of marriage and business ties ($\Delta G^2_{PL} = 0.2$). Model 4 is a path-dependent model and is a marginal improvement in fit over model 3b ($\Delta G^2_{PL} = 45.1$ with six additional parameters). Parameters corresponding to cycles with two or more business ties were excluded from the model because of the infrequency of occurrence of such structures.

Since as Padgett amp Ansell (1993) note the gaining of hierarchical status was the primaryconsideration in the arrangement of marriage ties between elite families we might expectmarriage ties to exhibit a tendency towards transitivity Hence model 5a assumes in addition

Philippa Pattison and Stanley Wasserman186

Table 6 Summary of t models 1andash6d to the Florentine network

Model No of parameters G2PL Mean absolute residual

1a 2 4872 00481b 1 4879 00482 3 4868 00483a 4 3232 00323b 5 3230 00324 11 2779 00295a 18 2437 00265b 17 2463 00266a 21 2279 00266b 23 2267 00266c 23 2252 00266d 23 2170 0025

to conditional dependencies for paths of length 2 pairwise conditional dependenciesamong marriage ties from i to j j to k and i to k (and hence adds a parameter correspondingto the relation X = MM Ccedil M) Further all possible stars comprising two relations areadded as well in order to investigate possible interdependencies between marriage andbusiness ties that are not evident at the level of ties from an actor i to an actor j (see thecomparison between the complete independence model 1a and the multiplex model 2) Thesedependencies also require various star parameters hz for Z equal to MM 9 M 9 M M 9 B andBB 9

The fit of model 5a was a modest improvement over that of model 4 (ΔG²_PL = 34.2 with six additional parameters). The estimated parameter corresponding to the relation MM ∩ M is not large, so in model 5b the parameter is removed, with little effect on the fit of the model (ΔG²_PL = 2.6).

A final set of models fitted to the data investigated the possibility of structural differences in ties according to party affiliation. As Padgett & Ansell (1993) observed, the first 10 family groups are substantially identified with the Medici party (the Medici family themselves comprising group 1), whereas the remaining groups of families are not. Padgett & Ansell described the remarkable structural differences between the network of relations within the Medici party and within the remaining (largely oligarchic) set. Models 6a–6d therefore allow various model 5b parameters to differ according to whether they refer to ties lying within the collection of Medici blocks, to ties connecting non-Medici blocks, or to ties crossing the boundary between the two collections of blocks. Model 6a allows such variation for the density parameter and is a substantial improvement over model 5b (ΔG²_PL = 18.4 with four additional parameters). Model 6b permits the parameters for 'mixed' out-stars comprising marriage and business ties to differ for the three types of blocks, and is not a substantial improvement over model 6a (ΔG²_PL = 1.4). Model 6c allows heterogeneity across blocks in the parameters for 2-paths comprising marriage and business ties; it also fails to improve fit compared to model 6a (ΔG²_PL = 2.5). The final model, 6d, permits heterogeneity across blocks in the parameters for paths comprising two marriage ties; in this case, there is a modest improvement in fit compared to model 6a (ΔG²_PL = 10.8 with two additional parameters).

The estimated parameters for model 6d are shown in Table 7. The estimates suggest a strong tendency for reciprocated business ties, a tendency that is unsurprising given the form of business or economic ties, such as partnerships. There are weaker tendencies for the existence of 2-paths comprising either marriage or business ties; marriage ties also appear to be more likely if they complete a cycle of three marriage ties. Padgett & Ansell (1993) noted the presence of these cycles and analysed both their development and their consequences; they make a compelling argument for their importance to the evolving structure of the oligarchy. It can also be seen from Table 7 that path structures in which an outgoing marriage tie is accompanied by an incoming business tie reduce the likelihood of the overall structure. Estimates of star parameters suggest the prevalence of heterogeneous stars in which a group of families have marriage ties with one group and business ties with another. The parameter estimates for homogeneous marriage in-stars and out-stars are both negative: there appears to have been a reduced conditional probability of a marriage tie to a family group if some other group also had such a tie and, to a lesser extent, if the first family group had another outgoing marriage tie.
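The block-dependent parameters of models 6a–6d require every ordered pair of family groups to be assigned to one of three block types. The following is a minimal sketch of that bookkeeping, assuming the groups are numbered 1 to 33 with groups 1 to 10 forming the Medici collection; the function names and the example ties are hypothetical and are not taken from the Florentine data.

```python
# Assumption: family groups are numbered 1..33 and groups 1..10 are the Medici collection.
MEDICI_GROUPS = frozenset(range(1, 11))


def block_type(i, j, medici=MEDICI_GROUPS):
    """Classify the tie i -> j as within Medici, within other groups, or between."""
    i_in, j_in = i in medici, j in medici
    if i_in and j_in:
        return "within_medici"
    if not i_in and not j_in:
        return "within_other"
    return "between"


def block_tie_counts(ties, medici=MEDICI_GROUPS):
    """Count ties of one relation (e.g. marriage) in each of the three block types."""
    counts = {"within_medici": 0, "within_other": 0, "between": 0}
    for i, j in ties:
        counts[block_type(i, j, medici)] += 1
    return counts


# Hypothetical ties, for illustration only.
example_ties = [(1, 2), (1, 15), (20, 21), (3, 4), (30, 5)]
print(block_tie_counts(example_ties))
```

Counts of this kind, computed separately for each relation, are the sort of block-specific statistics to which block-dependent density parameters would attach.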

The parameters for block-dependent densities suggest an enhanced likelihood of marriage ties within the Medici collection of family groups and, to a lesser extent, within the non-Medici collection; marriage ties between the two types of family groups were less likely. Business ties exhibit a substantially weaker pattern of the same form. Together, these characteristics of the network reflect what Padgett & Ansell noted was a remarkable interdependence of marriage and economic ties, on the one hand, and political partisanship, on the other, and they support their conclusion that the microstructure of marriage and economics was central to the formation of parties in Florence (1993, p. 1277). The block-dependence of marriage 2-paths takes a different and interesting form: such paths are less likely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection, and they are even more likely to link family groups of different types. The group containing members of the Medici family is the major contributor to this pattern, as they are the only Medici group with marriage connections outside the collection mobilized into the Medici party. Note that this structural effect is fitted at the same time as the cyclic pattern for marriage ties, so that although, as Padgett & Ansell noted, there are many more two-step marriage connections for non-Medici than for Medici partisans, many of the former connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).

Table 7. Parameter estimates for model 6d fitted to the Florentine network

Model parameter                        Z           η_Z      Approximate standard error

1-paths (choice)                       M          −5.17     1.02
                                       B          −7.37     1.25

2-cycles (reciprocity and exchange)    M ∩ M′      0.95     0.94
                                       B ∩ B′     10.33     1.72
                                       M ∩ B′      0.65     1.08

2-paths                                MM          0.66     0.32
                                       MB          0.16     0.38
                                       BM         −0.84     0.37
                                       BB          1.26     0.95

3-cycles                               MM ∩ M′     2.12     0.61
                                       MB ∩ M′    −0.35     0.85

2-stars                                MM′        −1.55     0.37
                                       M′M        −0.43     0.20
                                       BB′        −1.53     1.08
                                       B′B        −0.85     0.99
                                       MB′        −0.14     0.36
                                       M′B         0.92     0.35

Subgroup-dependent M effects
  1-paths within Medici                            3.71     1.12
  1-paths between subgroups                       −4.67     1.92
  1-paths within other subgroups                   0.96

Subgroup-dependent B effects
  1-paths within Medici                            0.70     1.06
  1-paths between subgroups                       −0.80     0.87
  1-paths within other subgroups                   0.10

Subgroup-dependent MM effects
  2-paths within Medici                           −1.33     0.46
  2-paths between subgroups                        1.08     0.44
  2-paths within other subgroups                   0.25
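One informal way to read Table 7 is to set each estimate against its approximate standard error. The sketch below does this for a few of the reported values. It is a rough screening device only: the standard errors accompany pseudo-likelihood estimates and are described as approximate, so the ratios should not be treated as formal test statistics. The dictionary and function names are hypothetical.

```python
# Selected (estimate, approximate standard error) pairs reported for model 6d in Table 7.
TABLE7_SELECTED = {
    "B ∩ B' (business reciprocity)": (10.33, 1.72),
    "M ∩ M' (marriage reciprocity)": (0.95, 0.94),
    "MM ∩ M' (marriage 3-cycle)": (2.12, 0.61),
    "MM' (marriage 2-star)": (-1.55, 0.37),
}


def estimate_to_se_ratio(estimate, std_error):
    """Ratio of an estimate to its approximate standard error (informal guide only)."""
    return estimate / std_error


for name, (est, se) in TABLE7_SELECTED.items():
    print(f"{name:<32} {estimate_to_se_ratio(est, se):7.2f}")
```

On this informal reading, the business reciprocity estimate stands out relative to its standard error, while the marriage reciprocity estimate does not.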

Thus, model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell's detailed account.

5. Conclusion

The multivariate p* model is very general in form and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996), and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number, r, of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research – much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus, a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus, the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form z(x) of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different 'parts' of the network (and in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that in the literature on algebraic models for multivariate networks there is a second tradition that can usefully guide such statistical developments. Local structural descriptions based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995) describe heterogeneity across individuals. Thus, a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown, in general, how Markov Chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).
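The simulation step of such a parametric bootstrap can be sketched for a deliberately small case. The code below is an illustration under stated assumptions, not the procedure used by Preisler (1993) or by the authors: it assumes a single directed relation whose probability is proportional to exp(theta_choice × number of ties + theta_recip × number of mutual dyads), and it uses a Metropolis rule that proposes toggling one tie at a time. The function name and parameter values are hypothetical.

```python
import math
import random


def simulate_toy_pstar(n, theta_choice, theta_recip, sweeps=200, seed=0):
    """Metropolis sampler for a toy digraph model with choice and reciprocity terms.

    Assumed model (for illustration only):
    Pr(X = x) proportional to exp(theta_choice * ties(x) + theta_recip * mutual_dyads(x)).
    """
    rng = random.Random(seed)
    x = [[0] * n for _ in range(n)]
    for _ in range(sweeps * n * (n - 1)):
        i, j = rng.sample(range(n), 2)  # random ordered pair with i != j
        # Change in the log-probability when the tie i -> j goes from absent to present:
        # one extra tie, plus one extra mutual dyad if j -> i is already present.
        logit = theta_choice + theta_recip * x[j][i]
        # Proposing to add the tie uses +logit; proposing to delete it uses -logit.
        log_ratio = logit if x[i][j] == 0 else -logit
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x[i][j] = 1 - x[i][j]
    return x


# Hypothetical parameter values, chosen only to exercise the sampler.
network = simulate_toy_pstar(n=5, theta_choice=-1.0, theta_recip=2.0)
for row in network:
    print(row)
```

In a bootstrap of the kind described, many such simulated networks would be generated from the fitted parameter values, the model would be refitted to each, and the spread of the re-estimated parameters would serve as a standard error estimate.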

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754), and the National Institute of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance, and Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.

Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.

Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.

Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Association, 47, 77–92.

Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.

Boorman, S. A. & White, H. C. (1976). Social structure from multiple networks: II. Role structures. American Journal of Sociology, 81, 1384–1446.

Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.

Breiger, R. L., Boorman, S. A. & Arabie, P. (1975). An algorithm for clustering relational data, with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.

Coleman, J. S., Katz, E. & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N. & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R. & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B. & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure II: Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E. & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds), Operations research proceedings 1986, pp. 455–460. Berlin/Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathematiques, Informatique et Science Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B. & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O. & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds), Quo Vadis, Graph Theory? Annals of Discrete Mathematics, 55, 349–366.

Frank, O. & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J. & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J. & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W. & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W. & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J. & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D. & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.


Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L. & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E. & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A. & Pattison, P. (1998). Multivariate p* models of producers' market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P. & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F. & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P. & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G. & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). p1/2. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D. & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M. & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.


Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S. & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S. & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S. & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 60, 401–425.

Wasserman, S. & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S. & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A. & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C. & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999


Model 2 – multiplexity. Model 2 is a multiplexity model with maximal cliques {(i, j, 1), (i, j, 2)}. The model allows for the possibility that an X_B tie from i to j is conditionally dependent on an X_N tie from i to j. The parameters of the model have the form η_Z, where Z may be B, N or B ∩ N; the corresponding sufficient statistics are f_B, f_N and f_B∩N, respectively. Thus, this model adds a single multiplex parameter, η_B∩N, to the two choice parameters in model 1a. Model 2 appears to be a substantial improvement over model 1a (ΔG²_PL = 253.7 with one additional parameter), but the small frequency of B ∩ N ties implies that the MPLE of its corresponding parameter is likely to have a large standard error.
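The three sufficient statistics of the multiplexity model can be illustrated with a short sketch. It assumes that each statistic is a simple count over ordered pairs of distinct actors: the number of B ties, the number of N ties, and the number of pairs carrying both a B and an N tie in the same direction. The function name and the 3 × 3 matrices are hypothetical and are not drawn from the grade 7 data.

```python
def multiplex_statistics(B, N):
    """Return (f_B, f_N, f_B_and_N) for two binary adjacency matrices of equal size."""
    n = len(B)
    f_B = f_N = f_BN = 0
    for i in range(n):
        for j in range(n):
            if i == j:
                continue  # self-ties are not defined
            f_B += B[i][j]
            f_N += N[i][j]
            f_BN += B[i][j] * N[i][j]  # both ties present from i to j
    return f_B, f_N, f_BN


# Hypothetical toy relations on three actors.
B = [[0, 1, 0],
     [1, 0, 1],
     [0, 0, 0]]
N = [[0, 1, 1],
     [0, 0, 0],
     [1, 0, 0]]
print(multiplex_statistics(B, N))
```

When the third count is small, as the text reports for B ∩ N in the grade 7 network, there is little information with which to estimate the multiplex parameter.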

Table 2. Vickers & Chan's (1981) network data: 'best friends' relation

0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 1 1 0 1 0 1 1 0 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0
0 0 0 1 0 0 1 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 0 1 0 1 0 1 1 0 0 0 1 1 0 0 1 0 1 0 0 0 0 0 0 1 0
1 1 1 1 1 0 0 1 0 1 1 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 0 0 1 1 1 1 1 1 1 1 0 0 1 1 1 0 1 0 0 1 1 1
1 1 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0
0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 1 1 1 1 1 1 0 1 0 1 1 1 0 1 0 0 1 1 1 1 0 0 0 0 0 0 0
1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 1 1 1 0 1 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 1 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 1 1 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 1 1 1 1 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 0 1 1 1 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 1 1 0 1 1 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 1 1 1 0 1 0 0 1 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 1 0 0 1 1 1 1 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 1 1 0 1 0 1 0 0 0 1 0 0 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 1 0 1 0 0 0 0 0 0 0 1 1 0 0 0 0 1
0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 0

Table 3. Vickers & Chan's (1981) network data: 'not friends' relation

0 0 0 0 0 0 1 0 1 1 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 0 0 1 1
0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 1 1 0 0 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0
1 0 1 1 0 0 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 1
0 0 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
0 0 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0
1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 1
1 0 0 1 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0

Table 4. Summary of fit of models 1a–5b to the grade 7 peer network

Model    No. of parameters    G²_PL     Mean absolute residual

1a        2                   1794.1    0.366
1b        1                   1797.3    0.367
2         3                   1540.4    0.329
3a        4                   1585.5    0.315
3b        5                   1558.4    0.311
4        13                   1511.0    0.300
5a       19                   1220.6    0.241
5b       23                   1032.3    0.196

Models 3a and 3b – reciprocity and exchange. Model 3 assumes bivariate dyad independence (as described by Wasserman, 1987) and has maximal cliques {(i, j, 1), (i, j, 2), (j, i, 1), (j, i, 2)}. We fit two restricted versions of the model: first, model 3a, in which only choice and reciprocity effects are assumed (with parameters η_Z for Z = B, N, B ∩ B′ and N ∩ N′), and second, model 3b, with an additional exchange parameter η_Z for the relation Z = B ∩ N′. In model 3a, the presence of an X_B tie from i to j is assumed to be conditionally dependent on the presence of an X_B tie from j to i (that is, on the presence of reciprocity); similarly for X_N ties. Model 3b allows, in addition, the presence of an X_B tie from i to j to be conditionally dependent on the presence of an X_N tie from j to i (that is, on the exchange of an X_N tie for an X_B one). We have not fitted the most general homogeneous dyad-independence model, which includes multiplexity parameters, since B and N co-occur only rarely (and, as a result, it is difficult to fit parameters corresponding to relations such as B ∩ N, B ∩ N ∩ B′, and so forth). The fit statistics in Table 4 indicate not only that model 3a is a substantial improvement over model 1a (ΔG²_PL = 208.6 with just two additional parameters), but also that model 3b provides a marginally better fit than model 3a (ΔG²_PL = 27.1 with one additional parameter). These figures suggest the presence of both reciprocity and exchange effects. Note, though, that the fit of model 3b is still not particularly good, with the mean of the absolute residuals equal to 0.311.
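The reciprocity and exchange effects can be made concrete by counting the dyadic configurations they refer to. The sketch below assumes that a mutual dyad is counted once per unordered pair and that an exchange is counted whenever a B tie in one direction is met by an N tie in the other; the paper's exact counting convention is not restated here, so this is an illustration rather than a reproduction. The function name and the toy matrices are hypothetical.

```python
def dyadic_statistics(B, N):
    """Return counts of B reciprocity, N reciprocity and B-for-N exchange."""
    n = len(B)
    recip_B = recip_N = exchange = 0
    for i in range(n):
        for j in range(i + 1, n):  # each unordered pair {i, j} visited once
            recip_B += B[i][j] * B[j][i]
            recip_N += N[i][j] * N[j][i]
            # A B tie one way answered by an N tie the other way (either orientation).
            exchange += B[i][j] * N[j][i] + B[j][i] * N[i][j]
    return recip_B, recip_N, exchange


# Hypothetical toy relations on three actors.
B = [[0, 1, 0],
     [1, 0, 1],
     [0, 0, 0]]
N = [[0, 1, 1],
     [0, 0, 0],
     [1, 0, 0]]
print(dyadic_statistics(B, N))
```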

Model 4 – path dependence. Model 4 is a path-dependent model and assumes that a tie of any type from i to j may be conditionally dependent on ties of any type from j to some third individual k. Maximal cliques therefore have the form {(i, j, m), (j, i, h)} or {(i, j, m), (j, k, h), (k, i, p)}; parameters and sufficient statistics are given by η_Z and f_Z, respectively, where Z may be any of the relations B, N, B ∩ B′, N ∩ N′, B ∩ N′, BB, BN, NB, NN, BB ∩ B′, BN ∩ B′, BN ∩ N′ and NN ∩ N′. Compared to model 3b, model 4 adds only marginally to the fit (ΔG²_PL = 47.4 with eight additional parameters).
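The path and cycle relations listed for model 4 can be illustrated by counting labelled 2-paths and 3-cycles. The sketch below assumes that each statistic counts ordered triples of distinct actors, so that BN refers to a B tie from i to j followed by an N tie from j to k, and BN ∩ B′ additionally requires a B tie from k back to i. This counting convention is an assumption made for illustration; the function names and toy matrices are hypothetical.

```python
from itertools import permutations


def two_path_count(X, Y):
    """Count ordered triples (i, j, k) of distinct actors with X[i][j] = Y[j][k] = 1."""
    n = len(X)
    return sum(X[i][j] * Y[j][k] for i, j, k in permutations(range(n), 3))


def three_cycle_count(X, Y, Z):
    """Count triples with an X tie i -> j, a Y tie j -> k and a Z tie k -> i."""
    n = len(X)
    return sum(X[i][j] * Y[j][k] * Z[k][i] for i, j, k in permutations(range(n), 3))


# Hypothetical toy relations on three actors.
B = [[0, 1, 0],
     [1, 0, 1],
     [0, 0, 0]]
N = [[0, 1, 1],
     [0, 0, 0],
     [1, 0, 0]]
print(two_path_count(B, N))        # labelled 2-paths of type BN
print(three_cycle_count(B, N, B))  # configurations of type BN ∩ B'
```

Because every rotation of a cycle is a separate ordered triple, a homogeneous cycle such as BB ∩ B′ is counted three times under this convention; dividing by three would give the number of distinct cycles.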

Models 5a and 5b – restricted Markov random graph models. The final set of models are path-dependent models with additional dependencies assumed on substantive grounds. All models have the model 4 parameters; in addition, model 5a possesses dependencies consistent with the transitivity-like hypothesis that friends are likely to agree on their relations with third parties (hence, likely pairwise conditional dependencies between relational ties of type X_B from i to j, i to k and j to k, and also between relational ties of type X_B from i to j, of type X_N from i to k, and of type X_N from j to k). Model 5b possesses additional dependencies consistent with the claim that non-friends are likely to disagree on their relations with third parties (hence, likely pairwise conditional dependencies between relational ties of type X_N from i to j, of type X_N from i to k, and of type X_B from j to k, and also between relational ties of type X_N from i to j, of type X_B from i to k, and of type X_N from j to k). (See Johnsen (1986) for a review and analysis of the literature on the structure of affective ties, and Pattison (1993) for an algebraic translation of these structural claims.) Model 5a adds {(i, j, 1), (j, k, 1), (i, k, 1)} and {(i, j, 1), (j, k, 2), (i, k, 2)} to the set of maximal cliques for model 4; model 5b also adds {(i, j, 2), (j, k, 1), (i, k, 2)} and {(i, j, 2), (j, k, 2), (i, k, 1)}. We note that all of the subcliques of these additional maximal cliques have corresponding parameters in models 5a and 5b; these additional subcliques correspond to various forms of stars: {(i, j, m), (i, k, h)}, {(i, j, m), (k, j, h)} and {(i, j, m), (j, k, h)}. As indicated in Table 4, the additional dependencies assumed by model 5a lead to a substantial improvement over the simple path-dependent model 4 (ΔG²_PL = 290.4 with six additional parameters), and those associated with model 5b lead to a modest further improvement in fit (ΔG²_PL = 188.3 with four additional parameters). The mean of the absolute residuals for model 5b is 0.196, suggesting a more reasonable fit to the data (but one that could lend itself to further possible improvement).
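The improvements in fit quoted in the text are differences between the pseudo-likelihood deviances reported in Table 4. A small sketch of that arithmetic follows; the deviances and parameter counts are those of Table 4, while the dictionary and function names are hypothetical.

```python
# Pseudo-likelihood deviance G2_PL and number of parameters for each model in Table 4.
G2_PL = {"1a": 1794.1, "1b": 1797.3, "2": 1540.4, "3a": 1585.5,
         "3b": 1558.4, "4": 1511.0, "5a": 1220.6, "5b": 1032.3}
N_PARAMS = {"1a": 2, "1b": 1, "2": 3, "3a": 4,
            "3b": 5, "4": 13, "5a": 19, "5b": 23}


def improvement(smaller, larger):
    """Return (drop in G2_PL, extra parameters) when moving from `smaller` to `larger`."""
    delta_g2 = round(G2_PL[smaller] - G2_PL[larger], 1)
    extra = N_PARAMS[larger] - N_PARAMS[smaller]
    return delta_g2, extra


for pair in [("1a", "2"), ("1a", "3a"), ("3a", "3b"), ("3b", "4"), ("4", "5a"), ("5a", "5b")]:
    print(pair, improvement(*pair))
```

Each printed pair can be checked against the corresponding ΔG²_PL and parameter count quoted in the model descriptions.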

The MPLEs for the parameters of model 5b are displayed in Table 5. Positive estimates were observed for both reciprocity parameters and for the parameters associated with three of the four additional hypothesized dependencies. Thus, the conditional odds of a tie of any type appear to be enhanced if a reciprocal tie of the same type is present, if the tie completes one of the expected triadic structures for agreement between friends, or if the tie completes a triad in which an individual would rather not have as a friend any friend of someone who has been indicated as a non-friend. Negative estimates were obtained for the exchange parameter, for 2-stars comprising two incoming X_B ties, and for 3-cycles comprising X_B ties. Thus, the conditional odds of a tie of any type appear to be reduced by the presence of a reciprocated tie of the other type; in addition, the odds of an X_B tie being directed to a particular individual are reduced if other X_B ties are also directed to the same individual, or if the tie completes a 3-cycle of X_B ties.
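Statements about enhanced or reduced conditional odds can be put on a numerical scale by exponentiating the estimates. The sketch below assumes the usual logit reading of a p* parameter: with all other statistics held fixed, a one-unit increase in a statistic multiplies the conditional odds of the tie by exp(η). It uses the Table 5 estimates for the two reciprocity parameters and the exchange parameter (2.53, 0.61 and −0.67); the function and dictionary names are hypothetical.

```python
import math


def odds_multiplier(eta):
    """Multiplicative change in the conditional odds for a one-unit change in the statistic."""
    return math.exp(eta)


# Estimates for model 5b as reported in Table 5.
SELECTED_ESTIMATES = {
    "B reciprocity": 2.53,
    "N reciprocity": 0.61,
    "B-for-N exchange": -0.67,
}

for name, eta in SELECTED_ESTIMATES.items():
    print(f"{name:<18} odds multiplied by {odds_multiplier(eta):.2f}")
```

Multipliers above 1 correspond to the enhanced odds described for reciprocity, and multipliers below 1 to the reduced odds described for exchange. Because the estimates are pseudo-likelihood estimates, the multipliers are indicative only.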

4.2. Padgett & Ansell's Florentine network

Our second example is an analysis of marriage and business ties among groups of Florentinefamilies (Padgett amp Ansell 1993) In an analysis of the rise to power of the Medici family inFlorence in the early fteenth century Padgett amp Ansell constructed a number of networkrelations among 33 groups of elite families including marriage and business or economicties The construction was based on a coding of various types of network relations among a92-family ruling elite from Kentrsquos (1978) description of the network foundations of theMedici party and their opponents Padgett amp Ansell used marriage and economic networks toderive a clustering of the 92 families into 33 family groups (using the CONCOR algorithmsee Breiger Boorman amp Arabie 1975) they then coded a relation of a particular typebetween two family groups if there were at least two pairs of families with one family fromeach group linked by a relation of that type The analysis presented below is for marriage andeconomic relations among these 33 family groups shown in gure 2a of Padgett amp Ansell(1993) for the purpose of the analysis reported below within-group relationships areignored and the various types of economic ties are aggregated into a single business

Logit models and logistic regressions for social networks II 185

Table 5 Parameter estimates for model 5b tted to the grade 7 peer network

Model parameter Z hZ Approximate standard error

1-paths B 2 181 076(choice) N 2 239 065

2-cycles B Ccedil B 9 253 037(reciprocity amp N Ccedil N 9 061 026exchange) B Ccedil N 9 2 067 028

2-paths BB 001 005BN 2 003 004NB 2 011 004NN 002 004

3-cycles BB Ccedil B9 2 072 014BN Ccedil B 9 005 008BN Ccedil N9 003 007NN Ccedil N 9 2 005 009

2-stars BB 9 2 036 008BN 9 2 008 004NN 9 006 004B 9 B 2 001 004B 9 N 2 004 003N 9 N 007 002

Additional BB Ccedil B 057 006hypothesized BN Ccedil N 017 005constraints NB Ccedil N 033 005

NN Ccedil B 2 009 006

economic relation Thus a marriage tie is coded from one group to another if a woman of the rst group is married to a man in the second a businesseconomic tie signi es the presence oftrading or partnership relationships the sharing or renting of real estate or a bank employ-ment relation (see Padgett amp Ansell 1993 pp 1265 ndash1266)
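The coding rule for ties between family groups can be sketched as follows. This is only an illustration of the rule as described above, not Padgett & Ansell's procedure: the function name `block_relation`, the toy data and the choice to count ordered family pairs are all assumptions made for the example.

```python
def block_relation(family_ties, groups, min_pairs=2):
    """Code a tie from group g to group h when at least `min_pairs`
    ordered family pairs (f in g, f2 in h) are linked by a tie f -> f2.

    family_ties : square 0/1 list of lists among families
    groups      : groups[f] is the group label of family f
    """
    labels = sorted(set(groups))
    index = {label: pos for pos, label in enumerate(labels)}
    n_groups = len(labels)
    counts = [[0] * n_groups for _ in range(n_groups)]
    for f, row in enumerate(family_ties):
        for f2, tie in enumerate(row):
            if tie:
                counts[index[groups[f]]][index[groups[f2]]] += 1
    # Within-group relationships are ignored, so the diagonal is set to 0.
    return [[int(g != h and counts[g][h] >= min_pairs)
             for h in range(n_groups)] for g in range(n_groups)]


# Toy data (hypothetical): five families in three groups.
groups = ["a", "a", "b", "b", "c"]
ties = [[0] * 5 for _ in range(5)]
ties[0][2] = ties[1][3] = 1   # two family pairs link group a to group b
ties[0][4] = 1                # only one family pair links group a to group c
block = block_relation(ties, groups)
print(block)
```

In the toy data, group a is tied to group b because two family pairs are linked, whereas the single link from group a to group c falls below the threshold.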

Padgett & Ansell used the interconnections among social and demographic factors, these relational ties, and actions on the part of Cosimo de' Medici to explain the source of the latter's extraordinary power; here we examine the joint network structure of the marriage and business/economic ties.

We label the relations studied by Padgett & Ansell as $X_B$ (business ties) and $X_M$ (marriage ties). Their associated matrices are $B$ and $M$, respectively.

In Table 6, we report the fit of six classes of models, similar in construction to those reported for the grade 7 peer network. As for the grade 7 peer network, models 1a and 1b are two- and one-parameter complete independence models, respectively, and model 2 is a multiplexity model. It is clear from Table 6 that there is little improvement in fit of the two-parameter choice complete independence model (model 1a) over the one-parameter choice model (model 1b) ($\Delta G^2_{PL} = 0.7$ with one extra parameter); in addition, permitting dependencies among marriage and business ties for the same individuals does little to improve model fit ($\Delta G^2_{PL} = 0.4$ for model 2 compared to model 1a). Models 3a and 3b are reciprocity and exchange models. Model 3a adds to model 1a the reciprocity effects for $X_B$ and $X_M$ ties; model 3b further adds the exchange effect that allows conditional dependence of a marriage tie from $i$ to $j$ and a business tie from $j$ to $i$. The reciprocity effects in model 3a lead to a substantial improvement in fit over model 1a ($\Delta G^2_{PL} = 164.0$ with two additional parameters), but no further improvement is achieved by permitting the dyadic exchange of marriage and business ties ($\Delta G^2_{PL} = 0.2$). Model 4 is a path-dependent model and is a marginal improvement in fit over model 3b ($\Delta G^2_{PL} = 45.1$ with six additional parameters). Parameters corresponding to cycles with two or more business ties were excluded from the model because of the infrequency of occurrence of such structures.
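The comparisons in this paragraph are simple differences of the $G^2_{PL}$ values in Table 6. The sketch below reproduces that arithmetic for models 1a to 4. The dictionary `fit` and the function `compare` are hypothetical names, and the values are read from Table 6 on the assumption that one decimal place was lost in extraction. The code makes no claim about a reference distribution for these differences.

```python
# (number of parameters, G2_PL) as read from Table 6, decimal points restored.
fit = {
    "1b": (1, 487.9),
    "1a": (2, 487.2),
    "2": (3, 486.8),
    "3a": (4, 323.2),
    "3b": (5, 323.0),
    "4": (11, 277.9),
}


def compare(smaller, larger):
    """Return (extra parameters, drop in G2_PL) for two nested models."""
    p0, g0 = fit[smaller]
    p1, g1 = fit[larger]
    return p1 - p0, round(g0 - g1, 1)


for smaller, larger in [("1b", "1a"), ("1a", "2"), ("1a", "3a"),
                        ("3a", "3b"), ("3b", "4")]:
    extra, delta = compare(smaller, larger)
    print(f"model {larger} vs model {smaller}: "
          f"delta G2_PL = {delta} with {extra} extra parameter(s)")
```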

Table 6. Summary of fit of models 1a–6d to the Florentine network

Model    No. of parameters    $G^2_{PL}$    Mean absolute residual
1a       2                    487.2         0.048
1b       1                    487.9         0.048
2        3                    486.8         0.048
3a       4                    323.2         0.032
3b       5                    323.0         0.032
4        11                   277.9         0.029
5a       18                   243.7         0.026
5b       17                   246.3         0.026
6a       21                   227.9         0.026
6b       23                   226.7         0.026
6c       23                   225.2         0.026
6d       23                   217.0         0.025

Since, as Padgett & Ansell (1993) note, the gaining of hierarchical status was the primary consideration in the arrangement of marriage ties between elite families, we might expect marriage ties to exhibit a tendency towards transitivity. Hence model 5a assumes, in addition to conditional dependencies for paths of length 2, pairwise conditional dependencies among marriage ties from $i$ to $j$, $j$ to $k$ and $i$ to $k$ (and hence adds a parameter corresponding to the relation $X = MM \cap M$). Further, all possible stars comprising two relations are added as well, in order to investigate possible interdependencies between marriage and business ties that are not evident at the level of ties from an actor $i$ to an actor $j$ (see the comparison between the complete independence model 1a and the multiplex model 2). These dependencies also require various star parameters hZ, for Z equal to $MM'$, $M'M$, $M'B$ and $BB'$.

The fit of model 5a was a modest improvement over that of model 4 ($\Delta G^2_{PL} = 34.2$ with six additional parameters). The estimated parameter corresponding to the relation $MM \cap M$ is not large, so in model 5b the parameter is removed, with little effect on the fit of the model ($\Delta G^2_{PL} = 2.6$).

A final set of models fitted to the data investigated the possibility of structural differences in ties according to party affiliation. As Padgett & Ansell (1993) observed, the first 10 family groups are substantially identified with the Medici party (the Medici family themselves comprising group 1), whereas the remaining groups of families are not. Padgett & Ansell described the remarkable structural differences between the network of relations within the Medici party and within the remaining (largely oligarchic) set. Models 6a–6d therefore allow various model 5b parameters to differ according to whether they refer to ties lying within the collection of Medici blocks, to ties connecting non-Medici blocks, or to ties crossing the boundary between the two collections of blocks. Model 6a allows such variation for the density parameter and is a substantial improvement over model 5b ($\Delta G^2_{PL} = 18.4$ with four additional parameters). Model 6b permits the parameters for 'mixed' out-stars comprising marriage and business ties to differ for the three types of blocks, and is not a substantial improvement over model 6a ($\Delta G^2_{PL} = 1.4$). Model 6c allows heterogeneity across blocks in the parameters for 2-paths comprising marriage and business ties; it also fails to improve fit compared to model 6a ($\Delta G^2_{PL} = 2.5$). The final model, 6d, permits heterogeneity across blocks in the parameters for paths comprising two marriage ties; in this case, there is a modest improvement in fit compared to model 6a ($\Delta G^2_{PL} = 10.8$ with two additional parameters).

The estimated parameters for model 6d are shown in Table 7. The estimates suggest a strong tendency for reciprocated business ties, a tendency that is unsurprising given the form of business or economic ties such as partnerships. There are weaker tendencies for the existence of 2-paths comprising either marriage or business ties; marriage ties also appear to be more likely if they complete a cycle of three marriage ties. Padgett & Ansell (1993) noted the presence of these cycles and analysed both their development and their consequences; they make a compelling argument for their importance to the evolving structure of the oligarchy. It can also be seen from Table 7 that path structures in which an outgoing marriage tie is accompanied by an incoming business tie reduce the likelihood of the overall structure. Estimates of star parameters suggest the prevalence of heterogeneous stars, in which a group of families have marriage ties with one group and business ties with another. The parameter estimates for homogeneous marriage in-stars and out-stars are both negative: there appears to have been a reduced conditional probability of a marriage tie to a family group if some other group also had such a tie and, to a lesser extent, if the first family group had another outgoing marriage tie.

Table 7. Parameter estimates for model 6d fitted to the Florentine network

Model parameter                          Z                 hZ        Approximate standard error
1-paths (choice)                         $M$               -5.17     1.02
                                         $B$               -7.37     1.25
2-cycles (reciprocity and exchange)      $M \cap M'$        0.95     0.94
                                         $B \cap B'$       10.33     1.72
                                         $M \cap B'$        0.65     1.08
2-paths                                  $MM$               0.66     0.32
                                         $MB$               0.16     0.38
                                         $BM$              -0.84     0.37
                                         $BB$               1.26     0.95
3-cycles                                 $MM \cap M'$       2.12     0.61
                                         $MB \cap M'$      -0.35     0.85
2-stars                                  $MM'$             -1.55     0.37
                                         $M'M$             -0.43     0.20
                                         $BB'$             -1.53     1.08
                                         $B'B$             -0.85     0.99
                                         $MB'$             -0.14     0.36
                                         $M'B$              0.92     0.35
Subgroup-dependent $M$ effects
  1-paths within Medici                                     3.71     1.12
  1-paths between subgroups                                -4.67     1.92
  1-paths within other subgroups                            0.96
Subgroup-dependent $B$ effects
  1-paths within Medici                                     0.70     1.06
  1-paths between subgroups                                -0.80     0.87
  1-paths within other subgroups                            0.10
Subgroup-dependent $MM$ effects
  2-paths within Medici                                    -1.33     0.46
  2-paths between subgroups                                 1.08     0.44
  2-paths within other subgroups                            0.25

The parameters for block-dependent densities suggest an enhanced likelihood of marriage ties within the Medici collection of family groups and, to a lesser extent, within the non-Medici collection; marriage ties between the two types of family groups were less likely. Business ties exhibit a substantially weaker pattern of the same form. Together, these characteristics of the network reflect what Padgett & Ansell noted was a remarkable interdependence of marriage and economic ties, on the one hand, and political partisanship, on the other, and they support their conclusion that the microstructure of marriage and economics was central to the formation of parties in Florence (1993, p. 1277). The block-dependence of marriage 2-paths takes a different and interesting form: such paths are less likely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection, and they are even more likely to link family groups of different types. The group containing members of the Medici family is the major contributor to this pattern, as they are the only Medici group with marriage connections outside the collection mobilized into the Medici party. Note that this structural effect is fitted at the same time as the cyclic pattern for marriage ties, so that although, as Padgett & Ansell noted, there are many more two-step marriage connections for non-Medici than for Medici partisans, many of the former connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).

Thus, model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell's detailed account.

5. Conclusion

The multivariate p* model is very general in form and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996), and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number $r$ of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research: much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus, a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus, the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form $z(x)$ of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different 'parts' of the network (and in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that, in the literature on algebraic models for multivariate networks, there is a second tradition that can usefully guide such statistical developments. Local structural descriptions, based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995), describe heterogeneity across individuals. Thus, a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown in general how Markov chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).
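To make the simulation step concrete, the following is a minimal sketch of a Metropolis sampler for a deliberately simplified model: a single directed relation with only a choice (density) parameter and a reciprocity parameter. It is not Preisler's procedure, nor the multivariate models fitted in this paper. The names `simulate_network` and `statistics`, the parameter values and the number of steps are all assumptions made for illustration. In a parametric bootstrap, each simulated network would be refitted and the spread of the refitted estimates used as a standard error; the refitting step is not shown.

```python
import math
import random


def simulate_network(n, theta_choice, theta_recip, steps=20000, seed=0):
    """Metropolis sampler for a toy single-relation model in which the
    probability of a digraph x is proportional to
    exp(theta_choice * ties(x) + theta_recip * mutual_dyads(x))."""
    rng = random.Random(seed)
    x = [[0] * n for _ in range(n)]
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)      # distinct ordered pair (i, j)
        # Change in the exponent if the tie i -> j were switched on.
        delta = theta_choice + theta_recip * x[j][i]
        # Switching an existing tie off reverses the sign of the change.
        log_ratio = delta if x[i][j] == 0 else -delta
        # Symmetric toggle proposal, so accept with probability min(1, ratio).
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x[i][j] = 1 - x[i][j]
    return x


def statistics(x):
    """Number of ties and number of mutual dyads in a 0/1 digraph."""
    n = len(x)
    ties = sum(map(sum, x))
    mutual = sum(x[i][j] * x[j][i] for i in range(n) for j in range(i + 1, n))
    return ties, mutual


# Five simulated replicates under hypothetical parameter values.
replicates = [statistics(simulate_network(10, -1.0, 2.0, steps=5000, seed=s))
              for s in range(5)]
print(replicates)
```

Toggling one ordered pair at a time keeps the proposal symmetric, so the acceptance probability depends only on the change in the two statistics.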

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754) and the National Institute of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance and Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.

Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.

Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.

Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Association, 47, 77–92.

Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.

Boorman, S. A., & White, H. C. (1976). Social structure from multiple networks: II. Role structures. American Journal of Sociology, 81, 1384–1446.

Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.

Breiger, R. L., Boorman, S. A., & Arabie, P. (1975). An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.

Coleman, J. S., Katz, E., & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N., & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R., & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B., & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency, and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds.), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure: II. Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E., & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981 (pp. 156–192). San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M., & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data (pp. 289–306). Chichester: Wiley.

Fienberg, S. E., Meyer, M. M., & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds.), Operations research proceedings 1986 (pp. 455–460). Berlin/Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathématiques, Informatique et Sciences Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B., & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O., & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds.), Quo vadis, graph theory? Annals of Discrete Mathematics, 55, 349–366.

Frank, O., & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J., & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J., & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W., & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W., & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J., & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D., & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.

Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L., & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E., & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A., & Pattison, P. (1998). Multivariate p* models of producers' market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P., & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm (pp. 293–318). New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F., & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds.), Mathematical and theoretical systems (pp. 139–149). Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P., & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G., & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). p1/2. In M. G. Everett & K. Rennolls (Eds.), Proceedings of the 1995 International Conference on Social Networks, 1 (pp. 151–160). London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D., & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M., & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.

Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S., & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S., & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 60, 401–425.

Wasserman, S., & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S., & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A., & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C., & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984 (pp. 314–344). San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999


is on the exchange of an $X_N$ tie for an $X_B$ one). We have not fitted the most general homogeneous dyad-independence model, which includes multiplexity parameters, since $B$ and $N$ co-occur only rarely (and, as a result, it is difficult to fit parameters corresponding to relations such as $B \cap N$, $B \cap N \cap B'$ and so forth). The fit statistics in Table 4 indicate that not only is model 3a a substantial improvement over model 1a ($\Delta G^2_{PL} = 208.6$ with just two additional parameters), but also that model 3b provides a marginally better fit than model 3a ($\Delta G^2_{PL} = 27.1$ with one additional parameter). These figures suggest the presence of both reciprocity and exchange effects. Note, though, that the fit of model 3b is still not particularly good, with the mean of the absolute residuals equal to 0.311.

Table 3. Vickers & Chan's (1981) network data: 'not friends' relation

0 0 0 0 0 0 1 0 1 1 0 0 1 0 1 0 0 1 0 1 0 0 1 0 1 0 0 1 1
0 0 0 0 0 0 0 0 1 0 0 1 1 0 0 0 1 1 1 1 0 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 0 0 0 1 1 0 0 0 0 0 1 1 1 0 0 1 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 1 1 0 0 1 0 1 0 0 1 0 0 0 0 1 1 0 1 0 0 0 1 1 1 0 0 0
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 1 0 0 0 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 1 1 0 0 0 1 0 0 1 0 1 0 0 0 0 1 1 0 0 0 0 0 1 0 0
1 0 1 1 0 0 1 0 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1
0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 1 1 0 0 1 1 1
0 0 1 1 0 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
0 0 0 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 1 0 0 1 0 0 0 0 0 1 0
0 0 1 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 1 0 0 1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 1 1
1 0 1 1 0 0 0 0 1 1 0 1 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0
0 1 0 0 1 1 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 0 0 1 0 1 0 0 0 0 0 0 0 1 0 0 0 0 0 1 0
1 1 1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 1 0 0 0 0 0 1 1 0 0 1 1
1 0 0 1 0 0 1 1 0 1 0 1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
1 0 1 0 0 1 0 0 0 0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0
1 1 1 1 1 1 1 1 1 1 1 0 0 0 0 0 0 1 0 0 0 0 0 0 1 0 0 0 0

Table 4. Summary of fit of models 1a–5b to the grade 7 peer network

Model    No. of parameters    $G^2_{PL}$    Mean absolute residual
1a       2                    1794.1        0.366
1b       1                    1797.3        0.367
2        3                    1540.4        0.329
3a       4                    1585.5        0.315
3b       5                    1558.4        0.311
4        13                   1511.0        0.300
5a       19                   1220.6        0.241
5b       23                   1032.3        0.196

Model 4 – path dependence. Model 4 is a path-dependent model and assumes that a tie of any type from $i$ to $j$ may be conditionally dependent on ties of any type from $j$ to some third individual $k$. Maximal cliques therefore have the form $\{(i, j, m), (j, i, h)\}$ or $\{(i, j, m), (j, k, h), (k, i, p)\}$; parameters and sufficient statistics are given by hZ and fZ, respectively, where Z may be any of the relations $B$, $N$, $B \cap B'$, $N \cap N'$, $B \cap N'$, $BB$, $BN$, $NB$, $NN$, $BB \cap B'$, $BN \cap B'$, $BN \cap N'$ and $NN \cap N'$. Compared to model 3b, model 4 adds only marginally to the fit ($\Delta G^2_{PL} = 47.4$ with eight additional parameters).
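To illustrate how counts for compound relations of this kind might be obtained from the two relation matrices, the sketch below uses integer matrix products for paths, transposes for converse relations and elementwise products for intersections. The function names and the toy matrices are hypothetical, and the counting conventions (ordered triples, with each 3-cycle therefore counted three times) are assumptions rather than the definitions used for the fitted models.

```python
def matmul(a, b):
    """Integer matrix product of two square lists of lists."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def transpose(a):
    return [list(col) for col in zip(*a)]


def overlap(a, b, skip_diagonal=False):
    """Sum of the elementwise product of a and b."""
    n = len(a)
    return sum(a[i][j] * b[i][j] for i in range(n) for j in range(n)
               if not (skip_diagonal and i == j))


def path_statistics(B, N):
    """A few configuration counts for two 0/1 relations with zero diagonals."""
    ones = [[1] * len(B) for _ in B]
    BB = matmul(B, B)
    return {
        "ties_B": overlap(B, ones),                                # B
        "mutual_B": overlap(B, transpose(B)) // 2,                 # B and B'
        "exchange_BN": overlap(B, transpose(N)),                   # B and N'
        "paths_BN": overlap(matmul(B, N), ones, skip_diagonal=True),  # BN
        "cycles_BBB": overlap(BB, transpose(B)),                   # BB and B'
        "transitive_BBB": overlap(BB, B),                          # BB and B
    }


# Toy relations on three individuals (hypothetical data).
B = [[0, 1, 0], [1, 0, 1], [1, 0, 0]]
N = [[0, 0, 1], [0, 0, 0], [0, 0, 0]]
stats = path_statistics(B, N)
print(stats)
```

The diagonal of the 2-path product is skipped so that a path returning to its starting point is not counted as a 2-path; such a configuration is a 2-cycle.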

Models 5a and 5b – restricted Markov random graph models. The final set of models are path-dependent models with additional dependencies assumed on substantive grounds. All models have the model 4 parameters; in addition, model 5a possesses dependencies consistent with the transitivity-like hypothesis that friends are likely to agree on their relations with third parties (hence, likely pairwise conditional dependencies between relational ties of type $X_B$ from $i$ to $j$, $i$ to $k$ and $j$ to $k$, and also between relational ties of type $X_B$ from $i$ to $j$, of type $X_N$ from $i$ to $k$, and of type $X_N$ from $j$ to $k$). Model 5b possesses additional dependencies consistent with the claim that non-friends are likely to disagree on their relations with third parties (hence, likely pairwise conditional dependencies between relational ties of type $X_N$ from $i$ to $j$, of type $X_N$ from $i$ to $k$, and of type $X_B$ from $j$ to $k$, and also between relational ties of type $X_N$ from $i$ to $j$, of type $X_B$ from $i$ to $k$, and of type $X_N$ from $j$ to $k$). (See Johnsen (1986) for a review and analysis of the literature on the structure of affective ties, and Pattison (1993) for an algebraic translation of these structural claims.) Model 5a adds $\{(i, j, 1), (j, k, 1), (i, k, 1)\}$ and $\{(i, j, 1), (j, k, 2), (i, k, 2)\}$ to the set of maximal cliques for model 4; model 5b also adds $\{(i, j, 2), (j, k, 1), (i, k, 2)\}$ and $\{(i, j, 2), (j, k, 2), (i, k, 1)\}$. We note that all of the subcliques of these additional maximal cliques have corresponding parameters in models 5a and 5b; these additional subcliques correspond to various forms of stars: $\{(i, j, m), (i, k, h)\}$, $\{(i, j, m), (k, j, h)\}$ and $\{(i, j, m), (j, k, h)\}$. As indicated in Table 4, the additional dependencies assumed by model 5a lead to a substantial improvement over the simple path-dependent model 4 ($\Delta G^2_{PL} = 290.4$ with six additional parameters), and those associated with model 5b lead to a modest further improvement in fit ($\Delta G^2_{PL} = 188.3$ with four additional parameters). The mean of the absolute residuals for model 5b is 0.196, suggesting a more reasonable fit to the data (but one that could lend itself to further possible improvement).

The MPLEs for the parameters of model 5b are displayed in Table 5 Positive estimateswere observed for both reciprocity parameters and for the parameters associated with three ofthe four additional hypothesized dependencies Thus the conditional odds of a tie of any typeappear to be enhanced if a reciprocal tie of the same type is present if the tie completes one ofthe expected triadic structures for agreement between friends or if the tie completes a triad inwhich an individual would rather not have as a friend any friend of someone who has beenindicated as a non-friend Negative estimates were obtained for the exchange parameter for2-stars comprising two incoming XB ties and for 3-cycles comprising XB ties Thus theconditional odds of a tie of any type appear to be reduced by the presence of a reciprocated tieof the other type in addition the odds of a XB tie being directed to a particular individual are

Philippa Pattison and Stanley Wasserman184

reduced if other XB ties are also directed to the same individual or if the tie completes a3-cycle of XB ties
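As a reminder of how MPLEs of this kind arise, the following minimal Python sketch fits a much simpler model by logistic regression of each possible tie on its change statistics. The model is an assumption made purely for illustration: a single relation with only a choice (density) parameter and a reciprocity parameter. The toy network and the function names are likewise hypothetical, and the code is not the software used for the analyses reported here; model 5b involves many more statistics and two relations.

```python
import math

def change_statistics(x):
    """Return (tie, [choice change, reciprocity change]) for each ordered pair."""
    n = len(x)
    rows = []
    for i in range(n):
        for j in range(n):
            if i != j:
                # Adding the tie i -> j raises the tie count by 1 and the
                # mutual-dyad count by x[j][i].
                rows.append((x[i][j], [1.0, float(x[j][i])]))
    return rows

def fit_pseudolikelihood(rows, iterations=25):
    """Newton-Raphson logistic regression for a two-parameter model."""
    theta = [0.0, 0.0]
    for _ in range(iterations):
        grad = [0.0, 0.0]
        info = [[0.0, 0.0], [0.0, 0.0]]
        for y, d in rows:
            p = 1.0 / (1.0 + math.exp(-(theta[0] * d[0] + theta[1] * d[1])))
            for a in range(2):
                grad[a] += (y - p) * d[a]
                for b in range(2):
                    info[a][b] += p * (1.0 - p) * d[a] * d[b]
        # Solve the 2 x 2 Newton system info * step = grad explicitly.
        det = info[0][0] * info[1][1] - info[0][1] * info[1][0]
        theta[0] += (info[1][1] * grad[0] - info[0][1] * grad[1]) / det
        theta[1] += (info[0][0] * grad[1] - info[1][0] * grad[0]) / det
    return theta

# Hypothetical 4-actor network with 2 mutual, 2 asymmetric and 2 null dyads.
toy = [
    [0, 1, 1, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],
    [0, 1, 1, 0],
]
choice, reciprocity = fit_pseudolikelihood(change_statistics(toy))
print(round(choice, 3), round(reciprocity, 3))
```

For this two-parameter illustration the estimates can be checked by hand: with M mutual, A asymmetric and N null dyads, the fitted values are log(A / 2N) for choice and log(4MN / A²) for reciprocity, that is, log 0.5 and log 4 for the toy network.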

Table 5. Parameter estimates for model 5b fitted to the grade 7 peer network

Model parameter                        Z          η_Z      Approximate standard error
1-paths (choice)                       B          −1.81    0.76
                                       N          −2.39    0.65
2-cycles (reciprocity & exchange)      B ∩ B′      2.53    0.37
                                       N ∩ N′      0.61    0.26
                                       B ∩ N′     −0.67    0.28
2-paths                                BB          0.01    0.05
                                       BN         −0.03    0.04
                                       NB         −0.11    0.04
                                       NN          0.02    0.04
3-cycles                               BB ∩ B′    −0.72    0.14
                                       BN ∩ B′     0.05    0.08
                                       BN ∩ N′     0.03    0.07
                                       NN ∩ N′    −0.05    0.09
2-stars                                BB′        −0.36    0.08
                                       BN′        −0.08    0.04
                                       NN′         0.06    0.04
                                       B′B        −0.01    0.04
                                       B′N        −0.04    0.03
                                       N′N         0.07    0.02
Additional hypothesized constraints    BB ∩ B      0.57    0.06
                                       BN ∩ N      0.17    0.05
                                       NB ∩ N      0.33    0.05
                                       NN ∩ B     −0.09    0.06

4.2. Padgett & Ansell's Florentine network

Our second example is an analysis of marriage and business ties among groups of Florentine families (Padgett & Ansell, 1993). In an analysis of the rise to power of the Medici family in Florence in the early fifteenth century, Padgett & Ansell constructed a number of network relations among 33 groups of elite families, including marriage and business or economic ties. The construction was based on a coding of various types of network relations among a 92-family ruling elite from Kent's (1978) description of the network foundations of the Medici party and their opponents. Padgett & Ansell used marriage and economic networks to derive a clustering of the 92 families into 33 family groups (using the CONCOR algorithm; see Breiger, Boorman & Arabie, 1975); they then coded a relation of a particular type between two family groups if there were at least two pairs of families, with one family from each group, linked by a relation of that type. The analysis presented below is for marriage and economic relations among these 33 family groups, shown in Figure 2a of Padgett & Ansell (1993); for the purpose of the analysis reported below, within-group relationships are ignored and the various types of economic ties are aggregated into a single business/economic relation. Thus, a marriage tie is coded from one group to another if a woman of the first group is married to a man in the second; a business/economic tie signifies the presence of trading or partnership relationships, the sharing or renting of real estate, or a bank employment relation (see Padgett & Ansell, 1993, pp. 1265–1266).

Padgett & Ansell used the interconnections among social and demographic factors, these relational ties, and actions on the part of Cosimo de' Medici to explain the source of the latter's extraordinary power; here we examine the joint network structure of the marriage and business/economic ties.

We label the relations studied by Padgett & Ansell as X_B (business ties) and X_M (marriage ties). Their associated matrices are B and M, respectively.

In Table 6, we report the fit of six classes of models, similar in construction to those reported for the grade 7 peer network. As for the grade 7 peer network, models 1a and 1b are two- and one-parameter complete independence models, respectively, and model 2 is a multiplexity model. It is clear from Table 6 that there is little improvement in fit of the two-parameter choice complete independence model (model 1a) over the one-parameter choice model (model 1b) (ΔG²_PL = 0.7 with one extra parameter); in addition, permitting dependencies among marriage and business ties for the same individuals does little to improve model fit (ΔG²_PL = 0.4 for model 2 compared to model 1a). Models 3a and 3b are reciprocity and exchange models. Model 3a adds to model 1a the reciprocity effects for X_B and X_M ties; model 3b further adds the exchange effect that allows conditional dependence of a marriage tie from i to j and a business tie from j to i. The reciprocity effects in model 3a lead to a substantial improvement in fit over model 1a (ΔG²_PL = 164.0 with two additional parameters), but no further improvement is achieved by permitting the dyadic exchange of marriage and business ties (ΔG²_PL = 0.2). Model 4 is a path-dependent model and is a marginal improvement in fit over model 3b (ΔG²_PL = 45.1 with six additional parameters). Parameters corresponding to cycles with two or more business ties were excluded from the model because of the infrequency of occurrence of such structures.
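The model comparisons quoted above are simple differences in pseudo-likelihood deviance between nested models. The short Python sketch below reproduces that arithmetic for models 1a to 4, using G²_PL values and parameter counts transcribed from Table 6; the dictionary and function names are hypothetical, and the sketch makes no claim about the reference distribution of these differences.

```python
# G2_PL values and parameter counts transcribed from Table 6 (Florentine network).
G2_PL = {"1a": 487.2, "1b": 487.9, "2": 486.8, "3a": 323.2, "3b": 323.0, "4": 277.9}
N_PARAMS = {"1a": 2, "1b": 1, "2": 3, "3a": 4, "3b": 5, "4": 11}

def delta_g2(smaller, larger):
    """Return (drop in G2_PL, extra parameters) moving from smaller to larger model."""
    drop = round(G2_PL[smaller] - G2_PL[larger], 1)
    extra = N_PARAMS[larger] - N_PARAMS[smaller]
    return drop, extra

comparisons = [("1b", "1a"), ("1a", "2"), ("1a", "3a"), ("3a", "3b"), ("3b", "4")]
for smaller, larger in comparisons:
    drop, extra = delta_g2(smaller, larger)
    print(f"{smaller} -> {larger}: delta G2_PL = {drop}, extra parameters = {extra}")
```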

Table 6. Summary of fit of models 1a–6d to the Florentine network

Model    No. of parameters    G²_PL    Mean absolute residual
1a       2                    487.2    0.048
1b       1                    487.9    0.048
2        3                    486.8    0.048
3a       4                    323.2    0.032
3b       5                    323.0    0.032
4        11                   277.9    0.029
5a       18                   243.7    0.026
5b       17                   246.3    0.026
6a       21                   227.9    0.026
6b       23                   226.7    0.026
6c       23                   225.2    0.026
6d       23                   217.0    0.025

Since, as Padgett & Ansell (1993) note, the gaining of hierarchical status was the primary consideration in the arrangement of marriage ties between elite families, we might expect marriage ties to exhibit a tendency towards transitivity. Hence, model 5a assumes, in addition to conditional dependencies for paths of length 2, pairwise conditional dependencies among marriage ties from i to j, j to k and i to k (and hence adds a parameter corresponding to the relation X = MM ∩ M). Further, all possible stars comprising two relations are added as well, in order to investigate possible interdependencies between marriage and business ties that are not evident at the level of ties from an actor i to an actor j (see the comparison between the complete independence model 1a and the multiplex model 2). These dependencies also require various star parameters η_Z, for Z equal to MM′, M′M, M′B and BB′.

The fit of model 5a was a modest improvement over that of model 4 (ΔG²_PL = 34.2 with six additional parameters). The estimated parameter corresponding to the relation MM ∩ M is not large, so in model 5b the parameter is removed, with little effect on the fit of the model (ΔG²_PL = 2.6).

A final set of models fitted to the data investigated the possibility of structural differences in ties according to party affiliation. As Padgett & Ansell (1993) observed, the first 10 family groups are substantially identified with the Medici party (the Medici family themselves comprising group 1), whereas the remaining groups of families are not. Padgett & Ansell described the remarkable structural differences between the network of relations within the Medici party and within the remaining (largely oligarchic) set. Models 6a–6d therefore allow various model 5b parameters to differ according to whether they refer to ties lying within the collection of Medici blocks, to ties connecting non-Medici blocks, or to ties crossing the boundary between the two collections of blocks. Model 6a allows such variation for the density parameter and is a substantial improvement over model 5b (ΔG²_PL = 18.4 with four additional parameters). Model 6b permits the parameters for 'mixed' out-stars comprising marriage and business ties to differ for the three types of blocks and is not a substantial improvement over model 6a (ΔG²_PL = 1.4). Model 6c allows heterogeneity across blocks in the parameters for 2-paths comprising marriage and business ties; it also fails to improve fit compared to model 6a (ΔG²_PL = 2.5). The final model, 6d, permits heterogeneity across blocks in the parameters for paths comprising two marriage ties; in this case, there is a modest improvement in fit compared to model 6a (ΔG²_PL = 10.8 with two additional parameters).

The estimated parameters for model 6d are shown in Table 7. The estimates suggest a strong tendency for reciprocated business ties, a tendency that is unsurprising given the form of business or economic ties such as partnerships. There are weaker tendencies for the existence of 2-paths comprising either marriage or business ties; marriage ties also appear to be more likely if they complete a cycle of three marriage ties. Padgett & Ansell (1993) noted the presence of these cycles and analysed both their development and their consequences; they make a compelling argument for their importance to the evolving structure of the oligarchy. It can also be seen from Table 7 that path structures in which an outgoing marriage tie is accompanied by an incoming business tie reduce the likelihood of the overall structure. Estimates of star parameters suggest the prevalence of heterogeneous stars, in which a group of families have marriage ties with one group and business ties with another. The parameter estimates for homogeneous marriage in-stars and out-stars are both negative: there appears to have been a reduced conditional probability of a marriage tie to a family group if some other group also had such a tie and, to a lesser extent, if the first family group had another outgoing marriage tie.
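Because the model is a logit model for each tie conditional on the rest of the network, an estimate can be read as a log odds multiplier: for a unit change in the corresponding statistic, with the other change statistics held fixed, the conditional odds of the tie are multiplied by the exponential of the estimate. The Python sketch below applies this to a few estimates transcribed from Table 7. The labels and the function name are hypothetical, and because the MPLE standard errors are only approximate, the multipliers are best treated as descriptive.

```python
import math

# Selected estimates transcribed from Table 7 (model 6d).
estimates = {
    "business reciprocity (B ∩ B′)": 10.33,
    "marriage 3-cycle (MM ∩ M′)": 2.12,
    "2-path BM": -0.84,
    "marriage 2-star MM′": -1.55,
}

def odds_multiplier(estimate, change=1):
    """Multiplicative change in conditional odds for a given change statistic."""
    return math.exp(estimate * change)

for label, value in estimates.items():
    print(f"{label}: odds multiplied by {odds_multiplier(value):.3g}")
```

Positive estimates give multipliers above 1 and negative estimates give multipliers below 1, which matches the direction of each effect described in the text.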

Table 7. Parameter estimates for model 6d fitted to the Florentine network

Model parameter                        Z          η_Z      Approximate standard error
1-paths (choice)                       M          −5.17    1.02
                                       B          −7.37    1.25
2-cycles (reciprocity and exchange)    M ∩ M′      0.95    0.94
                                       B ∩ B′     10.33    1.72
                                       M ∩ B′      0.65    1.08
2-paths                                MM          0.66    0.32
                                       MB          0.16    0.38
                                       BM         −0.84    0.37
                                       BB          1.26    0.95
3-cycles                               MM ∩ M′     2.12    0.61
                                       MB ∩ M′    −0.35    0.85
2-stars                                MM′        −1.55    0.37
                                       M′M        −0.43    0.20
                                       BB′        −1.53    1.08
                                       B′B        −0.85    0.99
                                       MB′        −0.14    0.36
                                       M′B         0.92    0.35
Subgroup-dependent M effects
  1-paths within Medici                            3.71    1.12
  1-paths between subgroups                       −4.67    1.92
  1-paths within other subgroups                   0.96
Subgroup-dependent B effects
  1-paths within Medici                            0.70    1.06
  1-paths between subgroups                       −0.80    0.87
  1-paths within other subgroups                   0.10
Subgroup-dependent MM effects
  2-paths within Medici                           −1.33    0.46
  2-paths between subgroups                        1.08    0.44
  2-paths within other subgroups                   0.25

The parameters for block-dependent densities suggest an enhanced likelihood of marriage ties within the Medici collection of family groups and, to a lesser extent, within the non-Medici collection; marriage ties between the two types of family groups were less likely. Business ties exhibit a substantially weaker pattern of the same form. Together, these characteristics of the network reflect what Padgett & Ansell noted was a remarkable interdependence of marriage and economic ties, on the one hand, and political partisanship, on the other, and they support their conclusion that the microstructure of marriage and economics was central to the formation of parties in Florence (1993, p. 1277). The block-dependence of marriage 2-paths takes a different and interesting form: such paths are less likely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection, and they are even more likely to link family groups of different types. The group containing members of the Medici family is the major contributor to this pattern, as they are the only Medici group with marriage connections outside the collection mobilized into the Medici party. Note that this structural effect is fitted at the same time as the cyclic pattern for marriage ties, so that although, as Padgett & Ansell noted, there are many more two-step marriage connections for non-Medici than for Medici partisans, many of the former connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).

Thus, model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell's detailed account.
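The three-way distinction that models 6a–6d rely on (ties within the Medici collection, ties within the remaining collection, and ties crossing between them) can be sketched as a simple classification of ordered pairs of family groups. The Python code below is only an illustration. It assumes the groups are numbered 1 to 33 with the first 10 identified with the Medici party, as stated in the text; the labels and the function name are hypothetical, and the way block-specific statistics entered the fitted models is not reproduced here.

```python
N_GROUPS = 33          # family groups, assumed to be numbered 1..33
MEDICI = range(1, 11)  # the first 10 groups are identified with the Medici party

def block_type(i, j):
    """Classify the ordered pair of family groups (i, j) by party membership."""
    if i == j:
        raise ValueError("within-group relationships are ignored")
    i_in, j_in = i in MEDICI, j in MEDICI
    if i_in and j_in:
        return "within Medici"
    if not i_in and not j_in:
        return "within other"
    return "between"

# Count how many possible directed ties fall in each type of block.
counts = {"within Medici": 0, "between": 0, "within other": 0}
for i in range(1, N_GROUPS + 1):
    for j in range(1, N_GROUPS + 1):
        if i != j:
            counts[block_type(i, j)] += 1
print(counts)
```

A block-dependent parameter then amounts to computing the relevant statistic separately over the pairs of each type, so that, for example, a density effect can differ across the three sets of possible ties.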

5. Conclusion

The multivariate p* model is very general in form and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996), and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number, r, of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research – much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus, a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus, the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form z(x) of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different 'parts' of the network (and in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that in the literature on algebraic models for multivariate networks there is a second tradition that can usefully guide such statistical developments. Local structural descriptions based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995) describe heterogeneity across individuals. Thus, a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown in general how Markov chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).
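The simulation step of such a bootstrap can be sketched as follows. This Python fragment is a minimal illustration under strong simplifying assumptions: a single directed relation with only choice and reciprocity effects, a symmetric single-tie toggle proposal (so that the Metropolis form of the acceptance rule applies), and a fixed number of steps with no burn-in diagnostics. The function name and arguments are hypothetical; it is not the procedure of Preisler (1993) or of Crouch & Wasserman (1998).

```python
import math
import random

def simulate(n, choice, reciprocity, steps, seed=0):
    """Metropolis sampler for a digraph with choice and reciprocity effects."""
    rng = random.Random(seed)
    x = [[0] * n for _ in range(n)]              # start from the empty graph
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)           # propose toggling the tie i -> j
        # Conditional log-odds that the tie is present, given the rest of the graph.
        logit = choice + reciprocity * x[j][i]
        # Change in log-probability if the proposed toggle is made.
        log_ratio = logit if x[i][j] == 0 else -logit
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x[i][j] = 1 - x[i][j]
    return x

# Hypothetical parameter values, chosen only to exercise the sampler.
sample = simulate(n=10, choice=-1.0, reciprocity=2.0, steps=5000, seed=1)
density = sum(map(sum, sample)) / (10 * 9)
print(round(density, 3))
```

In a parametric bootstrap of the kind described, many networks would be simulated from the fitted parameter values, the model would be refitted to each, and the spread of the resulting estimates would serve as an estimate of their standard errors.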

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754) and the National Institute of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance and Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.

Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.

Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.

Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Association, 47, 77–92.

Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.

Boorman, S. A., & White, H. C. (1976). Social structure from multiple networks: II. Role structures. American Journal of Sociology, 81, 1384–1446.

Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.

Breiger, R. L., Boorman, S. A., & Arabie, P. (1975). An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.

Coleman, J. S., Katz, E., & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N., & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R., & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B., & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure II: Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E., & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M., & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.

Fienberg, S. E., Meyer, M. M., & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds), Operations research proceedings 1986, pp. 455–460. Berlin/Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathematiques, Informatique et Science Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B., & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O., & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds), Quo Vadis, Graph Theory? Annals of Discrete Mathematics, 55, 349–366.

Frank, O., & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J., & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J., & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W., & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W., & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J., & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D., & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.

Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L., & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E., & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A., & Pattison, P. (1998). Multivariate p* models of producers' market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P., & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F., & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P., & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G., & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). p1/2. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D., & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M., & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.

Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S., & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S., & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 60, 401–425.

Wasserman, S., & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S., & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A., & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C., & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999


Models 5a and 5b ndash restricted Markov random graph models The nal set of models arepath-dependent models with additional dependencies assumed on substantive grounds Allmodels have the model 4 parameters in addition model 5a possesses dependenciesconsistent with the transitivity-like hypothesis that friends are likely to agree on theirrelations with third parties (hence likely pairwise conditional dependencies betweenrelational ties of type XB from i to j i to k and j to k and also between relational ties oftype XB from i to j of type XN from i to k and of type XN from j to k) Model 5b possessesadditional dependencies consistent with the claim that non-friends are likely to disagree ontheir relations with third parties (hence likely pairwise conditional dependencies betweenrelational ties of type XN from i to j of type XN from i to k and of type XB from j to k and alsobetween relational ties of type XN from i to j of type XB from i to k and of type XN from j to k(See Johnsen (1986) for a review and analysis of the literature on the structure of affectiveties and Pattison (1993) for an algebraic translation of these structural claims) Model 5aadds (i j 1) (j k 1) (i k 1) and ((i j 1 )(j k 2) (i k 2) to the set of maximal cliques formodel 4 model 5b also adds (i j 2) (j k 1) (i k 2) and (i j 2) (j k 2) (i k 1) We notethat all of the subcliques of these additional maximal cliques have corresponding parametersin models 5a and 5b these additional subcliques correspond to various forms of stars(i j m) (i k h) (i j m) (k j h) and (i j m) (j k h) As indicated in Table 4 theadditional dependencies assumed by model 5a lead to a substantial improvement over thesimple path-dependent model 4 (DG2

PL = 2904 with six additional parameters) and thoseassociated with model 5b lead to a modest further improvement in t (DG2

PL = 1883 withfour additional parameters) The mean of the absolute residuals for model 5b is 0196suggesting a more reasonable t to the data (but one that could lend itself to further possibleimprovement)

The MPLEs for the parameters of model 5b are displayed in Table 5. Positive estimates were observed for both reciprocity parameters and for the parameters associated with three of the four additional hypothesized dependencies. Thus, the conditional odds of a tie of any type appear to be enhanced if a reciprocal tie of the same type is present, if the tie completes one of the expected triadic structures for agreement between friends, or if the tie completes a triad in which an individual would rather not have as a friend any friend of someone who has been indicated as a non-friend. Negative estimates were obtained for the exchange parameter, for 2-stars comprising two incoming X_B ties, and for 3-cycles comprising X_B ties. Thus, the conditional odds of a tie of any type appear to be reduced by the presence of a reciprocated tie of the other type; in addition, the odds of an X_B tie being directed to a particular individual are reduced if other X_B ties are also directed to the same individual, or if the tie completes a 3-cycle of X_B ties.
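Because p* models are fitted in logit form, each estimate acts additively on the log of the conditional odds of a tie, so exponentiating an estimate gives the factor by which those odds change when the corresponding statistic increases by one, other statistics being held fixed. The following is a minimal sketch of that conversion, reading three of the Table 5 estimates as 2.53 (B ∩ B′), −0.67 (B ∩ N′) and −0.72 (BB ∩ B′). The function and dictionary names are illustrative and are not part of the original analysis.

```python
import math

# Selected model 5b estimates as read from Table 5 (log-odds scale).
estimates = {
    "reciprocity B ∩ B'": 2.53,
    "exchange B ∩ N'": -0.67,
    "3-cycle BB ∩ B'": -0.72,
}

def odds_multiplier(estimate, change=1):
    """Factor applied to the conditional odds of a tie when the statistic changes by `change`."""
    return math.exp(estimate * change)

for name, value in estimates.items():
    print(f"{name}: conditional odds multiplied by {odds_multiplier(value):.2f}")
```

A multiplier above 1 corresponds to enhanced conditional odds and a multiplier below 1 to reduced conditional odds, which matches the verbal reading of the signs given in the text.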

Table 5. Parameter estimates for model 5b fitted to the grade 7 peer network

Model parameter                       Z          Estimate   Approximate standard error
1-paths (choice)                      B          −1.81      0.76
                                      N          −2.39      0.65
2-cycles (reciprocity & exchange)     B ∩ B′      2.53      0.37
                                      N ∩ N′      0.61      0.26
                                      B ∩ N′     −0.67      0.28
2-paths                               BB          0.01      0.05
                                      BN         −0.03      0.04
                                      NB         −0.11      0.04
                                      NN          0.02      0.04
3-cycles                              BB ∩ B′    −0.72      0.14
                                      BN ∩ B′     0.05      0.08
                                      BN ∩ N′     0.03      0.07
                                      NN ∩ N′    −0.05      0.09
2-stars                               BB′        −0.36      0.08
                                      BN′        −0.08      0.04
                                      NN′         0.06      0.04
                                      B′B        −0.01      0.04
                                      B′N        −0.04      0.03
                                      N′N         0.07      0.02
Additional hypothesized constraints   BB ∩ B      0.57      0.06
                                      BN ∩ N      0.17      0.05
                                      NB ∩ N      0.33      0.05
                                      NN ∩ B     −0.09      0.06

4.2. Padgett & Ansell's Florentine network

Our second example is an analysis of marriage and business ties among groups of Florentine families (Padgett & Ansell, 1993). In an analysis of the rise to power of the Medici family in Florence in the early fifteenth century, Padgett & Ansell constructed a number of network relations among 33 groups of elite families, including marriage and business or economic ties. The construction was based on a coding of various types of network relations among a 92-family ruling elite from Kent's (1978) description of the network foundations of the Medici party and their opponents. Padgett & Ansell used marriage and economic networks to derive a clustering of the 92 families into 33 family groups (using the CONCOR algorithm; see Breiger, Boorman & Arabie, 1975); they then coded a relation of a particular type between two family groups if there were at least two pairs of families, with one family from each group, linked by a relation of that type. The analysis presented below is for marriage and economic relations among these 33 family groups, shown in figure 2a of Padgett & Ansell (1993); for the purpose of the analysis reported below, within-group relationships are ignored and the various types of economic ties are aggregated into a single business/economic relation. Thus, a marriage tie is coded from one group to another if a woman of the first group is married to a man in the second; a business/economic tie signifies the presence of trading or partnership relationships, the sharing or renting of real estate, or a bank employment relation (see Padgett & Ansell, 1993, pp. 1265–1266).

Padgett & Ansell used the interconnections among social and demographic factors, these relational ties, and actions on the part of Cosimo de' Medici to explain the source of the latter's extraordinary power; here we examine the joint network structure of the marriage and business/economic ties.

We label the relations studied by Padgett & Ansell as X_B (business ties) and X_M (marriage ties). Their associated matrices are B and M, respectively.

In Table 6, we report the fit of six classes of models, similar in construction to those reported for the grade 7 peer network. As for the grade 7 peer network, models 1a and 1b are two- and one-parameter complete independence models, respectively, and model 2 is a multiplexity model. It is clear from Table 6 that there is little improvement in fit of the two-parameter choice complete independence model (model 1a) over the one-parameter choice model (model 1b) (ΔG²_PL = 0.7 with one extra parameter); in addition, permitting dependencies among marriage and business ties for the same individuals does little to improve model fit (ΔG²_PL = 0.4 for model 2 compared to model 1a). Models 3a and 3b are reciprocity and exchange models. Model 3a adds to model 1a the reciprocity effects for X_B and X_M ties; model 3b further adds the exchange effect that allows conditional dependence of a marriage tie from i to j and a business tie from j to i. The reciprocity effects in model 3a lead to a substantial improvement in fit over model 1a (ΔG²_PL = 164.0 with two additional parameters), but no further improvement is achieved by permitting the dyadic exchange of marriage and business ties (ΔG²_PL = 0.2). Model 4 is a path-dependent model and is a marginal improvement in fit over model 3b (ΔG²_PL = 45.1 with six additional parameters). Parameters corresponding to cycles with two or more business ties were excluded from the model because of the infrequency of occurrence of such structures.
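The comparisons in this paragraph are differences in the pseudo-likelihood deviance G²_PL between pairs of nested models, set against the number of parameters added. As a minimal sketch, the arithmetic can be reproduced from the Table 6 entries for models 1a to 4, read here with one decimal place (487.2 for model 1a, and so on). The dictionary and function names are illustrative, and the differences are treated only as descriptive guides to fit, not as formal test statistics.

```python
# (number of parameters, G2_PL) for the Florentine models, as read from Table 6.
table6 = {
    "1a": (2, 487.2),
    "1b": (1, 487.9),
    "2": (3, 486.8),
    "3a": (4, 323.2),
    "3b": (5, 323.0),
    "4": (11, 277.9),
}

def compare(simpler, richer):
    """Return (extra parameters, drop in G2_PL) when moving from `simpler` to `richer`."""
    p0, g0 = table6[simpler]
    p1, g1 = table6[richer]
    return p1 - p0, round(g0 - g1, 1)

for simpler, richer in [("1b", "1a"), ("1a", "2"), ("1a", "3a"), ("3a", "3b"), ("3b", "4")]:
    extra, drop = compare(simpler, richer)
    print(f"model {richer} vs model {simpler}: drop in G2_PL = {drop} with {extra} extra parameter(s)")
```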

Table 6. Summary of fit of models 1a–6d to the Florentine network

Model   No. of parameters   G²_PL   Mean absolute residual
1a       2                  487.2   0.048
1b       1                  487.9   0.048
2        3                  486.8   0.048
3a       4                  323.2   0.032
3b       5                  323.0   0.032
4       11                  277.9   0.029
5a      18                  243.7   0.026
5b      17                  246.3   0.026
6a      21                  227.9   0.026
6b      23                  226.7   0.026
6c      23                  225.2   0.026
6d      23                  217.0   0.025

Since, as Padgett & Ansell (1993) note, the gaining of hierarchical status was the primary consideration in the arrangement of marriage ties between elite families, we might expect marriage ties to exhibit a tendency towards transitivity. Hence, model 5a assumes, in addition to conditional dependencies for paths of length 2, pairwise conditional dependencies among marriage ties from i to j, j to k and i to k (and hence adds a parameter corresponding to the relation X = MM ∩ M). Further, all possible stars comprising two relations are added as well, in order to investigate possible interdependencies between marriage and business ties that are not evident at the level of ties from an actor i to an actor j (see the comparison between the complete independence model 1a and the multiplex model 2). These dependencies also require various star parameters, for Z equal to MM′, M′M, M′B and BB′.
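The relational labels attached to these parameters can be read as operations on the adjacency matrices: MM is the composition of M with itself (a 2-path), M′ is the converse of M (its transpose), and ∩ is element-wise intersection. The sketch below counts two such configurations on a toy marriage matrix. The data and function names are hypothetical, and counting every configuration, instead of recording only whether one exists for a given pair of actors, is an assumption about how the statistic is defined.

```python
def transitive_triples(m):
    """Count triples (i, j, k) with ties i->j, j->k and i->k (label MM ∩ M).

    Assumes a zero diagonal, so i, j and k are necessarily distinct.
    """
    n = len(m)
    return sum(m[i][j] * m[j][k] * m[i][k]
               for i in range(n) for j in range(n) for k in range(n))

def in_stars(m):
    """Count unordered pairs of ties i->j and k->j with i != k (label MM')."""
    n = len(m)
    total = 0
    for j in range(n):
        indegree = sum(m[i][j] for i in range(n))
        total += indegree * (indegree - 1) // 2
    return total

# Toy directed marriage relation on four family groups (hypothetical data).
M = [
    [0, 1, 1, 0],
    [0, 0, 1, 0],
    [0, 0, 0, 0],
    [0, 0, 1, 0],
]
print(transitive_triples(M), in_stars(M))
```

Out-stars (label M′M) could be counted in the same way from row sums, and mixed stars such as M′B by pairing a row of M with the same row of B.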

The fit of model 5a was a modest improvement over that of model 4 (ΔG²_PL = 34.2 with six additional parameters). The estimated parameter corresponding to the relation MM ∩ M is not large, so in model 5b the parameter is removed, with little effect on the fit of the model (ΔG²_PL = 2.6).

A final set of models fitted to the data investigated the possibility of structural differences in ties according to party affiliation. As Padgett & Ansell (1993) observed, the first 10 family groups are substantially identified with the Medici party (the Medici family themselves comprising group 1), whereas the remaining groups of families are not. Padgett & Ansell described the remarkable structural differences between the network of relations within the Medici party and within the remaining (largely oligarchic) set. Models 6a–6d therefore allow various model 5b parameters to differ according to whether they refer to ties lying within the collection of Medici blocks, to ties connecting non-Medici blocks, or to ties crossing the boundary between the two collections of blocks. Model 6a allows such variation for the density parameter and is a substantial improvement over model 5b (ΔG²_PL = 18.4 with four additional parameters). Model 6b permits the parameters for 'mixed' out-stars comprising marriage and business ties to differ for the three types of blocks and is not a substantial improvement over model 6a (ΔG²_PL = 1.4). Model 6c allows heterogeneity across blocks in the parameters for 2-paths comprising marriage and business ties; it also fails to improve fit compared to model 6a (ΔG²_PL = 2.5). The final model, 6d, permits heterogeneity across blocks in the parameters for paths comprising two marriage ties; in this case, there is a modest improvement in fit compared to model 6a (ΔG²_PL = 10.8 with two additional parameters).

The estimated parameters for model 6d are shown in Table 7. The estimates suggest a strong tendency for reciprocated business ties, a tendency that is unsurprising given the form of business or economic ties, such as partnerships. There are weaker tendencies for the existence of 2-paths comprising either marriage or business ties; marriage ties also appear to be more likely if they complete a cycle of three marriage ties. Padgett & Ansell (1993) noted the presence of these cycles and analysed both their development and their consequences; they make a compelling argument for their importance to the evolving structure of the oligarchy. It can also be seen from Table 7 that path structures in which an outgoing marriage tie is accompanied by an incoming business tie reduce the likelihood of the overall structure. Estimates of star parameters suggest the prevalence of heterogeneous stars in which a group of families have marriage ties with one group and business ties with another. The parameter estimates for homogeneous marriage in-stars and out-stars are both negative; there appears to have been a reduced conditional probability of a marriage tie to a family group if some other group also had such a tie and, to a lesser extent, if the first family group had another outgoing marriage tie.
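The block-dependent effects in models 6a–6d rest on classifying each tie by whether it lies within the Medici collection, within the remaining collection, or between the two. A minimal sketch of that classification follows, assuming family groups are numbered 1 to 33 with groups 1 to 10 forming the Medici collection, as the text describes. The function names and the example ties are hypothetical.

```python
from collections import Counter

MEDICI_GROUPS = frozenset(range(1, 11))  # family groups 1-10, following the text

def block_type(i, j):
    """Classify a tie from group i to group j by the collections its endpoints belong to."""
    i_medici = i in MEDICI_GROUPS
    j_medici = j in MEDICI_GROUPS
    if i_medici and j_medici:
        return "within Medici"
    if not i_medici and not j_medici:
        return "within other"
    return "between"

def ties_by_block(ties):
    """Count ties of each block type; such counts are the block-specific density statistics."""
    return Counter(block_type(i, j) for i, j in ties)

# Hypothetical directed ties (sender group, receiver group).
example_ties = [(1, 2), (2, 15), (15, 2), (20, 21)]
print(ties_by_block(example_ties))
```

The same classification could be applied to 2-paths or stars by classifying the pair of groups at the ends of each configuration, which is one way to read the block-dependent 2-path effects of model 6d.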

Table 7. Parameter estimates for model 6d fitted to the Florentine network

Model parameter                         Z                                  Estimate   Approximate standard error
1-paths (choice)                        M                                  −5.17      1.02
                                        B                                  −7.37      1.25
2-cycles (reciprocity and exchange)     M ∩ M′                              0.95      0.94
                                        B ∩ B′                             10.33      1.72
                                        M ∩ B′                              0.65      1.08
2-paths                                 MM                                  0.66      0.32
                                        MB                                  0.16      0.38
                                        BM                                 −0.84      0.37
                                        BB                                  1.26      0.95
3-cycles                                MM ∩ M′                             2.12      0.61
                                        MB ∩ M′                            −0.35      0.85
2-stars                                 MM′                                −1.55      0.37
                                        M′M                                −0.43      0.20
                                        BB′                                −1.53      1.08
                                        B′B                                −0.85      0.99
                                        MB′                                −0.14      0.36
                                        M′B                                 0.92      0.35
Subgroup-dependent M effects            1-paths within Medici               3.71      1.12
                                        1-paths between subgroups          −4.67      1.92
                                        1-paths within other subgroups      0.96
Subgroup-dependent B effects            1-paths within Medici               0.70      1.06
                                        1-paths between subgroups          −0.80      0.87
                                        1-paths within other subgroups      0.10
Subgroup-dependent MM effects           2-paths within Medici              −1.33      0.46
                                        2-paths between subgroups           1.08      0.44
                                        2-paths within other subgroups      0.25

The parameters for block-dependent densities suggest an enhanced likelihood of marriage ties within the Medici collection of family groups and, to a lesser extent, within the non-Medici collection; marriage ties between the two types of family groups were less likely. Business ties exhibit a substantially weaker pattern of the same form. Together, these characteristics of the network reflect what Padgett & Ansell noted was a remarkable interdependence of marriage and economic ties, on the one hand, and political partisanship, on the other, and they support their conclusion that the microstructure of marriage and economics was central to the formation of parties in Florence (1993, p. 1277). The block-dependence of marriage 2-paths takes a different and interesting form: such paths are less likely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection, and they are even more likely to link family groups of different types. The group containing members of the Medici family is the major contributor to this pattern, as they are the only Medici group with marriage connections outside the collection mobilized into the Medici party. Note that this structural effect is fitted at the same time as the cyclic pattern for marriage ties, so that although, as Padgett & Ansell noted, there are many more two-step marriage connections for non-Medici than for Medici partisans, many of the former connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).

Thus, model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell's detailed account.

5. Conclusion

The multivariate p* model is very general in form and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996), and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number r of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research: much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus, a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus, the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form z(x) of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different 'parts' of the network (and in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that in the literature on algebraic models for multivariate networks there is a second tradition that can usefully guide such statistical developments. Local structural descriptions based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995) describe heterogeneity across individuals. Thus, a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown, in general, how Markov Chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).
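The simulation step of such a parametric bootstrap can be sketched as follows. This is a minimal illustration only: it assumes a single directed relation with just choice and reciprocity statistics, uses hypothetical parameter values, and uses the plain Metropolis rule, which is the special case of Metropolis–Hastings with symmetric proposals. It is not the multivariate model fitted in this paper, and all names are illustrative.

```python
import math
import random

def change_statistics(x, i, j):
    """Change in (number of ties, number of mutual dyads) when the tie i->j is switched on."""
    return (1, x[j][i])

def simulate(n, theta, steps, rng):
    """Metropolis sampler that proposes toggling one tie at a time, starting from the empty digraph."""
    x = [[0] * n for _ in range(n)]
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)  # two distinct actors, so no self-ties
        delta = change_statistics(x, i, j)
        log_odds = sum(t * d for t, d in zip(theta, delta))
        # Log ratio of model probabilities: proposed graph versus current graph.
        log_ratio = log_odds if x[i][j] == 0 else -log_odds
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x[i][j] = 1 - x[i][j]
    return x

theta = (-1.5, 2.0)  # hypothetical choice and reciprocity parameters
graph = simulate(n=8, theta=theta, steps=5000, rng=random.Random(1))
ties = sum(map(sum, graph))
mutual = sum(graph[i][j] * graph[j][i] for i in range(8) for j in range(i + 1, 8))
print(ties, mutual)
```

A bootstrap along these lines would draw many such graphs from the fitted model, re-estimate the parameters on each, and take the spread of the re-estimates as an approximate standard error.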

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754) and the National Institute of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance and Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.

Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.

Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.

Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Association, 47, 77–92.

Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.

Boorman, S. A. & White, H. C. (1976). Social structure from multiple networks: II. Role structures. American Journal of Sociology, 81, 1384–1446.

Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.

Breiger, R. L., Boorman, S. A. & Arabie, P. (1975). An algorithm for clustering relational data, with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.

Coleman, J. S., Katz, E. & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N. & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R. & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B. & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure II: Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E. & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds), Operations research proceedings 1986, pp. 455–460. Berlin/Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathematiques Informatique et Science Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B. & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O. & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds), Quo Vadis, Graph Theory? Annals of Discrete Mathematics, 55, 349–366.

Frank, O. & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J. & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J. & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W. & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W. & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J. & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D. & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.

Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L. & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E. & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A. & Pattison, P. (1998). Multivariate p* models of producers' market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P. & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F. & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P. & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G. & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). p12. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D. & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M. & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.

Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S. & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S. & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S. & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 60, 401–425.

Wasserman, S. & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S. & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A. & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C. & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999

Logit models and logistic regressions for social networks II 193

Page 17: Logit models and logistic regressions for social …...Fienberg, Meyer & Wasserman (1985), Wasserman (1987), Iacobucci & Wasserman (1987) and Iacobucci (1989) extended thep1family

reduced if other XB ties are also directed to the same individual or if the tie completes a3-cycle of XB ties

42 Padgett amp Ansellrsquos Florentine network

Our second example is an analysis of marriage and business ties among groups of Florentinefamilies (Padgett amp Ansell 1993) In an analysis of the rise to power of the Medici family inFlorence in the early fteenth century Padgett amp Ansell constructed a number of networkrelations among 33 groups of elite families including marriage and business or economicties The construction was based on a coding of various types of network relations among a92-family ruling elite from Kentrsquos (1978) description of the network foundations of theMedici party and their opponents Padgett amp Ansell used marriage and economic networks toderive a clustering of the 92 families into 33 family groups (using the CONCOR algorithmsee Breiger Boorman amp Arabie 1975) they then coded a relation of a particular typebetween two family groups if there were at least two pairs of families with one family fromeach group linked by a relation of that type The analysis presented below is for marriage andeconomic relations among these 33 family groups shown in gure 2a of Padgett amp Ansell(1993) for the purpose of the analysis reported below within-group relationships areignored and the various types of economic ties are aggregated into a single business

Logit models and logistic regressions for social networks II 185

Table 5 Parameter estimates for model 5b tted to the grade 7 peer network

Model parameter Z hZ Approximate standard error

1-paths B 2 181 076(choice) N 2 239 065

2-cycles B Ccedil B 9 253 037(reciprocity amp N Ccedil N 9 061 026exchange) B Ccedil N 9 2 067 028

2-paths BB 001 005BN 2 003 004NB 2 011 004NN 002 004

3-cycles BB Ccedil B9 2 072 014BN Ccedil B 9 005 008BN Ccedil N9 003 007NN Ccedil N 9 2 005 009

2-stars BB 9 2 036 008BN 9 2 008 004NN 9 006 004B 9 B 2 001 004B 9 N 2 004 003N 9 N 007 002

Additional BB Ccedil B 057 006hypothesized BN Ccedil N 017 005constraints NB Ccedil N 033 005

NN Ccedil B 2 009 006

economic relation Thus a marriage tie is coded from one group to another if a woman of the rst group is married to a man in the second a businesseconomic tie signi es the presence oftrading or partnership relationships the sharing or renting of real estate or a bank employ-ment relation (see Padgett amp Ansell 1993 pp 1265 ndash1266)

Padgett amp Ansell used the interconnections among social and demographic factors theserelational ties and actions on the part of Cosimo dersquo Medici to explain the source of thelatterrsquos extraordinary power here we examine the joint network structure of the marriage andbusinesseconomic ties

We label the relations studied by Padgett amp Ansell as XB (business ties) and XM (marriageties) Their associated matrices are B and M respectively

In Table 6, we report the fit of six classes of models, similar in construction to those reported for the grade 7 peer network. As for the grade 7 peer network, models 1a and 1b are two- and one-parameter complete independence models, respectively, and model 2 is a multiplexity model. It is clear from Table 6 that there is little improvement in fit of the two-parameter choice complete independence model (model 1a) over the one-parameter choice model (model 1b) ($\Delta G^2_{PL} = 0.7$ with one extra parameter); in addition, permitting dependencies among marriage and business ties for the same individuals does little to improve model fit ($\Delta G^2_{PL} = 0.4$ for model 2 compared to model 1a). Models 3a and 3b are reciprocity and exchange models. Model 3a adds to model 1a the reciprocity effects for $X_B$ and $X_M$ ties; model 3b further adds the exchange effect that allows conditional dependence of a marriage tie from i to j and a business tie from j to i. The reciprocity effects in model 3a lead to a substantial improvement in fit over model 1a ($\Delta G^2_{PL} = 164.0$ with two additional parameters), but no further improvement is achieved by permitting the dyadic exchange of marriage and business ties ($\Delta G^2_{PL} = 0.2$). Model 4 is a path-dependent model and is a marginal improvement in fit over model 3b ($\Delta G^2_{PL} = 45.1$ with six additional parameters). Parameters corresponding to cycles with two or more business ties were excluded from the model because of the infrequency of occurrence of such structures.
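The comparisons in this paragraph are differences in $G^2_{PL}$ between nested models. A minimal sketch of that arithmetic, using the $G^2_{PL}$ values and parameter counts reported in Table 6 (the dictionary and function names are illustrative and not part of the original analysis):

```python
# G2_PL and number of parameters for models 1a-4, taken from Table 6.
FIT = {
    "1a": (487.2, 2),
    "1b": (487.9, 1),
    "2": (486.8, 3),
    "3a": (323.2, 4),
    "3b": (323.0, 5),
    "4": (277.9, 11),
}


def delta_g2(smaller, larger):
    """Return (drop in G2_PL, extra parameters) going from `smaller` to `larger`."""
    g_small, p_small = FIT[smaller]
    g_large, p_large = FIT[larger]
    return round(g_small - g_large, 1), p_large - p_small


for pair in [("1b", "1a"), ("1a", "2"), ("1a", "3a"), ("3a", "3b"), ("3b", "4")]:
    print(pair, delta_g2(*pair))
```

Because the models are fitted by pseudolikelihood, these differences are probably best read as rough guides to improvement in fit rather than as formal test statistics.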

Since, as Padgett & Ansell (1993) note, the gaining of hierarchical status was the primary consideration in the arrangement of marriage ties between elite families, we might expect marriage ties to exhibit a tendency towards transitivity. Hence model 5a assumes, in addition to conditional dependencies for paths of length 2, pairwise conditional dependencies among marriage ties from i to j, j to k and i to k (and hence adds a parameter corresponding to the relation $X = MM \cap M$). Further, all possible stars comprising two relations are added as well, in order to investigate possible interdependencies between marriage and business ties that are not evident at the level of ties from an actor i to an actor j (see the comparison between the complete independence model 1a and the multiplex model 2). These dependencies also require various star parameters $\theta_Z$, for Z equal to $MM'$, $M'M$, $M'B$ and $BB'$.

Table 6. Summary of fit of models 1a–6d to the Florentine network

Model   No. of parameters   G2PL    Mean absolute residual
1a       2                  487.2   0.048
1b       1                  487.9   0.048
2        3                  486.8   0.048
3a       4                  323.2   0.032
3b       5                  323.0   0.032
4       11                  277.9   0.029
5a      18                  243.7   0.026
5b      17                  246.3   0.026
6a      21                  227.9   0.026
6b      23                  226.7   0.026
6c      23                  225.2   0.026
6d      23                  217.0   0.025
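The transitivity and star configurations added in model 5a can be counted directly from the relation matrices. The sketch below is one plausible reading of those counts, assuming binary adjacency matrices with no self-ties; the function names are hypothetical, and the four-actor network is invented purely for illustration (it is not the Florentine data).

```python
import numpy as np


def _clean(x):
    """Return a 0/1 integer copy of x with the diagonal (self-ties) set to zero."""
    x = (np.asarray(x) != 0).astype(int)
    np.fill_diagonal(x, 0)
    return x


def transitive_triples(m):
    """Count triples (i, k, j) with i->k, k->j and i->j, as in MM ∩ M."""
    m = _clean(m)
    return int(((m @ m) * m).sum())


def in_stars(a, b):
    """Count ordered pairs i != j with i -a-> k and j -b-> k (entries of A B')."""
    a, b = _clean(a), _clean(b)
    s = a @ b.T
    np.fill_diagonal(s, 0)
    return int(s.sum())


def out_stars(a, b):
    """Count ordered pairs i != j with k -a-> i and k -b-> j (entries of A' B)."""
    a, b = _clean(a), _clean(b)
    s = a.T @ b
    np.fill_diagonal(s, 0)
    return int(s.sum())


# Toy 4-actor network (invented for illustration; NOT the Florentine data).
M = np.zeros((4, 4), dtype=int)
B = np.zeros((4, 4), dtype=int)
for i, j in [(0, 1), (1, 2), (0, 2), (3, 2)]:
    M[i, j] = 1
B[0, 3] = 1

print(transitive_triples(M))
print(in_stars(M, M), out_stars(M, M), out_stars(M, B))
```

When both arguments are the same relation, each unordered pair of ties is counted twice (once in each order); whether a given analysis halves such counts is a convention not settled by this sketch.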

The fit of model 5a was a modest improvement over that of model 4 ($\Delta G^2_{PL} = 34.2$ with six additional parameters). The estimated parameter corresponding to the relation $MM \cap M$ is not large, so in model 5b the parameter is removed, with little effect on the fit of the model ($\Delta G^2_{PL} = 2.6$).

A final set of models fitted to the data investigated the possibility of structural differences in ties according to party affiliation. As Padgett & Ansell (1993) observed, the first 10 family groups are substantially identified with the Medici party (the Medici family themselves comprising group 1), whereas the remaining groups of families are not. Padgett & Ansell described the remarkable structural differences between the network of relations within the Medici party and within the remaining (largely oligarchic) set. Models 6a–6d therefore allow various model 5b parameters to differ according to whether they refer to ties lying within the collection of Medici blocks, to ties connecting non-Medici blocks, or to ties crossing the boundary between the two collections of blocks. Model 6a allows such variation for the density parameter and is a substantial improvement over model 5b ($\Delta G^2_{PL} = 18.4$ with four additional parameters). Model 6b permits the parameters for 'mixed' out-stars comprising marriage and business ties to differ for the three types of blocks and is not a substantial improvement over model 6a ($\Delta G^2_{PL} = 1.4$). Model 6c allows heterogeneity across blocks in the parameters for 2-paths comprising marriage and business ties; it also fails to improve fit compared to model 6a ($\Delta G^2_{PL} = 2.5$). The final model, 6d, permits heterogeneity across blocks in the parameters for paths comprising two marriage ties; in this case there is a modest improvement in fit compared to model 6a ($\Delta G^2_{PL} = 10.8$ with two additional parameters).

The estimated parameters for model 6d are shown in Table 7. The estimates suggest a strong tendency for reciprocated business ties, a tendency that is unsurprising given the form of business or economic ties such as partnerships. There are weaker tendencies for the existence of 2-paths comprising either marriage or business ties; marriage ties also appear to be more likely if they complete a cycle of three marriage ties. Padgett & Ansell (1993) noted the presence of these cycles and analysed both their development and their consequences; they make a compelling argument for their importance to the evolving structure of the oligarchy. It can also be seen from Table 7 that path structures in which an outgoing marriage tie is accompanied by an incoming business tie reduce the likelihood of the overall structure. Estimates of star parameters suggest the prevalence of heterogeneous stars in which a group of families have marriage ties with one group and business ties with another. The parameter estimates for homogeneous marriage in-stars and out-stars are both negative; there appears to have been a reduced conditional probability of a marriage tie to a family group if some other group also had such a tie and, to a lesser extent, if the first family group had another outgoing marriage tie.

The parameters for block-dependent densities suggest an enhanced likelihood of marriage ties within the Medici collection of family groups and, to a lesser extent, within the non-Medici collection; marriage ties between the two types of family groups were less likely. Business ties exhibit a substantially weaker pattern of the same form. Together, these characteristics of the network reflect what Padgett & Ansell noted was a remarkable interdependence of marriage and economic ties, on the one hand, and political partisanship, on the other, and they support their conclusion that the microstructure of marriage and economics was central to the formation of parties in Florence (1993, p. 1277). The block-dependence of marriage 2-paths takes a different and interesting form: such paths are less likely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection, and they are even more likely to link family groups of different types. The group containing members of the Medici family is the major contributor to this pattern, as they are the only Medici group with marriage connections outside the collection mobilized into the Medici party. Note that this structural effect is fitted at the same time as the cyclic pattern for marriage ties, so that although, as Padgett & Ansell noted, there are many more two-step marriage connections for non-Medici than for Medici partisans, many of the former connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).

Table 7. Parameter estimates for model 6d fitted to the Florentine network

Model parameter                        Z          θ_Z     Approximate standard error

1-paths (choice)                       M          -5.17   1.02
                                       B          -7.37   1.25

2-cycles (reciprocity and exchange)    M ∩ M′      0.95   0.94
                                       B ∩ B′     10.33   1.72
                                       M ∩ B′      0.65   1.08

2-paths                                MM          0.66   0.32
                                       MB          0.16   0.38
                                       BM         -0.84   0.37
                                       BB          1.26   0.95

3-cycles                               MM ∩ M′     2.12   0.61
                                       MB ∩ M′    -0.35   0.85

2-stars                                MM′        -1.55   0.37
                                       M′M        -0.43   0.20
                                       BB′        -1.53   1.08
                                       B′B        -0.85   0.99
                                       MB′        -0.14   0.36
                                       M′B         0.92   0.35

Subgroup-dependent M effects
  1-paths within Medici                            3.71   1.12
  1-paths between subgroups                       -4.67   1.92
  1-paths within other subgroups                   0.96

Subgroup-dependent B effects
  1-paths within Medici                            0.70   1.06
  1-paths between subgroups                       -0.80   0.87
  1-paths within other subgroups                   0.10

Subgroup-dependent MM effects
  2-paths within Medici                           -1.33   0.46
  2-paths between subgroups                        1.08   0.44
  2-paths within other subgroups                   0.25
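To make the block-dependent density estimates concrete, the sketch below adds each subgroup-dependent 1-path estimate in Table 7 to the corresponding overall choice estimate. It assumes that the subgroup effects enter additively on the logit scale alongside the overall choice parameter; that is an interpretive assumption about Table 7 rather than something stated explicitly in this section, and the names used are illustrative.

```python
# Overall choice (1-path) estimates and subgroup-dependent estimates from Table 7.
CHOICE = {"M": -5.17, "B": -7.37}
OFFSET = {
    "M": {"within Medici": 3.71, "between subgroups": -4.67, "within other": 0.96},
    "B": {"within Medici": 0.70, "between subgroups": -0.80, "within other": 0.10},
}


def block_choice_effects(relation):
    """Return the assumed additive choice effect for each type of block."""
    base = CHOICE[relation]
    return {block: round(base + off, 2) for block, off in OFFSET[relation].items()}


for rel in ("M", "B"):
    print(rel, block_choice_effects(rel))
```

Under this reading, both relations show the ordering within Medici > within other subgroups > between subgroups, with a much smaller spread for business ties, which agrees with the description of the block-dependent densities given in the text.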

Thus, model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell's detailed account.

5. Conclusion

The multivariate p* model is very general in form and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996), and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number, r, of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research; much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus, a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus, the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form z(x) of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different 'parts' of the network (and, in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that, in the literature on algebraic models for multivariate networks, there is a second tradition that can usefully guide such statistical developments. Local structural descriptions based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995) describe heterogeneity across individuals. Thus, a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown in general how Markov chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).
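The simulation step of such a parametric bootstrap can be sketched as follows. This is a minimal single-relation illustration, assuming a model with only choice (density) and reciprocity statistics and a simple one-tie-at-a-time Metropolis proposal; it is not the procedure used by Preisler (1993) or in the analyses reported here, and all function and parameter names are hypothetical.

```python
import math
import random


def change_stats(x, i, j):
    """Change in (number of ties, number of mutual dyads) if tie i->j goes 0 -> 1."""
    return (1, x[j][i])


def simulate(n, theta, steps, seed=0):
    """Metropolis sampler for a directed graph x on n actors with
    Pr(x) proportional to exp(theta[0] * ties + theta[1] * mutual dyads),
    started from the empty graph."""
    rng = random.Random(seed)
    x = [[0] * n for _ in range(n)]
    for _ in range(steps):
        i, j = rng.sample(range(n), 2)      # random ordered pair with i != j
        d_ties, d_mutual = change_stats(x, i, j)
        log_ratio = theta[0] * d_ties + theta[1] * d_mutual
        if x[i][j] == 1:                    # the proposal removes an existing tie
            log_ratio = -log_ratio
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x[i][j] = 1 - x[i][j]           # accept the toggle
    return x


# Example: one simulated network under illustrative (not estimated) parameters.
g = simulate(n=10, theta=(-1.0, 2.0), steps=5000, seed=1)
ties = sum(map(sum, g))
mutual = sum(g[i][j] * g[j][i] for i in range(10) for j in range(i + 1, 10))
print(ties, mutual)
```

In a full bootstrap, many networks would be simulated from the fitted model, each would be refitted by pseudolikelihood, and the spread of the refitted estimates would serve as an approximate standard error.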

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754) and the National Institute of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance, and to Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.

Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.

Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.

Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Institute, 47, 77–92.

Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.

Boorman, S. A. & White, H. C. (1976). Social structure from multiple networks: II. Role structures. American Journal of Sociology, 81, 1384–1446.

Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.

Breiger, R. L., Boorman, S. A. & Arabie, P. (1975). An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.

Coleman, J. S., Katz, E. & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N. & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R. & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B. & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure II: Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E. & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds), Operations research proceedings 1986, pp. 455–460. Berlin, Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathématiques, Informatique et Sciences Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B. & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O. & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds), Quo Vadis, Graph Theory? Annals of Discrete Mathematics, 55, 349–366.

Frank, O. & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J. & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J. & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W. & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W. & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J. & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D. & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.


Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L. & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E. & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A. & Pattison, P. (1998). Multivariate p* models of producers' market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P. & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F. & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P. & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G. & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). $p_{1/2}$. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D. & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M. & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.


Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S. & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S. & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S. & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 61, 401–425.

Wasserman, S. & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S. & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A. & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C. & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999

Logit models and logistic regressions for social networks II 193

Page 18: Logit models and logistic regressions for social …...Fienberg, Meyer & Wasserman (1985), Wasserman (1987), Iacobucci & Wasserman (1987) and Iacobucci (1989) extended thep1family

economic relation Thus a marriage tie is coded from one group to another if a woman of the rst group is married to a man in the second a businesseconomic tie signi es the presence oftrading or partnership relationships the sharing or renting of real estate or a bank employ-ment relation (see Padgett amp Ansell 1993 pp 1265 ndash1266)

Padgett amp Ansell used the interconnections among social and demographic factors theserelational ties and actions on the part of Cosimo dersquo Medici to explain the source of thelatterrsquos extraordinary power here we examine the joint network structure of the marriage andbusinesseconomic ties

We label the relations studied by Padgett amp Ansell as XB (business ties) and XM (marriageties) Their associated matrices are B and M respectively

In Table 6 we report the t of six classes of models similar in construction to thosereported for the grade 7 peer network As for the grade 7 peer network models 1a and 1b aretwo- and one-parameter complete independence models respectively and model 2 is amultiplexity model It is clear from Table 6 that there is little improvement in t of the two-parameter choice complete independence model (model 1a) over the one-parameter choicemodel (model 1b) (DG2

PL = 07 with one extra parameter) in addition permitting depen-dencies among marriage and business ties for the same individuals does little to improvemodel t (DG2

PL = 04 for model 2 compared to model 1a) Models 3a and 3b are reciprocityand exchange models Model 3a adds to model 1a the reciprocity effects for XB and XM tiesmodel 3b further adds the exchange effect that allows conditional dependence of a marriagetie from i to j and a business tie from j to i The reciprocity effects in model 3a lead to asubstantial improvement in t over model 1a (DG2

PL = 1640 with two additional para-meters) but no further improvement is achieved by permitting the dyadic exchange ofmarriage and business ties (DG2

PL = 02) Model 4 is a path-dependent model and is amarginal improvement in t over model 3b (DG2

PL = 451 with six additional parameters)Parameters corresponding to cycles with two or more business ties were excluded from themodel because of the infrequency of occurrence of such structures

Since as Padgett amp Ansell (1993) note the gaining of hierarchical status was the primaryconsideration in the arrangement of marriage ties between elite families we might expectmarriage ties to exhibit a tendency towards transitivity Hence model 5a assumes in addition

Philippa Pattison and Stanley Wasserman186

Table 6 Summary of t models 1andash6d to the Florentine network

Model No of parameters G2PL Mean absolute residual

1a 2 4872 00481b 1 4879 00482 3 4868 00483a 4 3232 00323b 5 3230 00324 11 2779 00295a 18 2437 00265b 17 2463 00266a 21 2279 00266b 23 2267 00266c 23 2252 00266d 23 2170 0025

to conditional dependencies for paths of length 2 pairwise conditional dependenciesamong marriage ties from i to j j to k and i to k (and hence adds a parameter correspondingto the relation X = MM Ccedil M) Further all possible stars comprising two relations areadded as well in order to investigate possible interdependencies between marriage andbusiness ties that are not evident at the level of ties from an actor i to an actor j (see thecomparison between the complete independence model 1a and the multiplex model 2) Thesedependencies also require various star parameters hz for Z equal to MM 9 M 9 M M 9 B andBB 9

The t of model 5a was a modest improvement over that of model 4 (DG2PL = 342 with

six additional parameters) The estimated parameter corresponding to the relation MM Ccedil Mis not large so in model 5b the parameter is removed with little effect on the t of the model(DG2

PL = 26)A nal set of models tted to the data investigated the possibility of structural differences

in ties according to party af liation As Padgett amp Ansell (1993) observed the rst 10 familygroups are substantially identi ed with the Medici party (the Medici family themselvescomprising group 1) whereas the remaining groups of families are not Padgett amp Anselldescribed the remarkable structural differences between the network of relations within theMedici party and within the remaining (largely oligarchic) set Models 6andash6d therefore allowvarious model 5b parameters to differ according to whether they refer to ties lying eitherwithin the collection of Medici blocks to ties connecting non-Medici blocks or to tiescrossing the boundary between the two collections of blocks Model 6a allows such variationfor the density parameter and is a substantial improvement over model 5b (DG2

PL = 184 withfour additional parameters) Model 6b permits the parameters for lsquomixedrsquo out-stars compris-ing marriage and business ties to differ for the three types of blocks and is not a substantialimprovement over model 6a (DG2

PL = 14) Model 6c allows heterogeneity across blocks inthe parameters for 2-paths comprising marriage and business ties it also fails to improve tcompared to model 6a (DG2

PL = 25) The nal model 6d permits heterogeneity acrossblocks in the parameters for paths comprising two marriage ties in this case there is a modestimprovement in t compared to model 6a (DG2

PL = 108 with two additional parameters)The estimated parameters for model 6d are shown in Table 7 The estimates suggest a

strong tendency for reciprocated business ties a tendency that is unsurprising given the formof business or economic ties such as partnerships There are weaker tendencies for theexistence of 2-paths comprising either marriage or business ties marriage ties also appear tobe more likely if they complete a cycle of three marriage ties Padgett amp Ansell (1993) notedthe presence of these cycles and analysed both their development and their consequencesthey make a compelling argument for their importance to the evolving structure of theoligarchy It can also be seen from Table 7 that path structures in which an outgoing marriagetie is accompanied by an incoming business tie reduce the likelihood of the overall structureEstimates of star parameters suggest the prevalence of heterogeneous stars in which a groupof families have marriage ties with one group and business ties with another The parameterestimates for homogeneous marriage in-stars and out-stars are both negative there appears tohave been a reduced conditional probability of a marriage tie to a family group if some othergroup also had such a tie and to a lesser extent if the rst family group had another outgoingmarriage tie

The parameters for block-dependent densities suggest an enhanced likelihood of marriage ties within the Medici collection of family groups and, to a lesser extent, within the non-Medici collection; marriage ties between the two types of family groups were less likely. Business ties exhibit a substantially weaker pattern of the same form. Together, these characteristics of the network reflect what Padgett & Ansell noted was a remarkable interdependence of marriage and economic ties, on the one hand, and political partisanship on the other, and they support their conclusion that the microstructure of marriage and economics was central to the formation of parties in Florence (1993, p. 1277). The block-dependence of marriage 2-paths takes a different and interesting form: such paths are less likely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection, and they are even more likely to link family groups of different types. The group containing members of the Medici family is the major contributor to this pattern, as they are the only Medici group with marriage connections outside the collection mobilized into the Medici party. Note that this structural effect is fitted at the same time as the cyclic pattern for marriage ties, so that although, as Padgett & Ansell noted, there are many more two-step marriage connections for non-Medici than for Medici partisans, many of the former connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).

Table 7. Parameter estimates for model 6d fitted to the Florentine network

Model parameter                        Z                                  θ_Z      Approximate standard error
1-paths (choice)                       M                                  −5.17    1.02
                                       B                                  −7.37    1.25
2-cycles (reciprocity and exchange)    M ∩ M′                              0.95    0.94
                                       B ∩ B′                             10.33    1.72
                                       M ∩ B′                              0.65    1.08
2-paths                                MM                                  0.66    0.32
                                       MB                                  0.16    0.38
                                       BM                                 −0.84    0.37
                                       BB                                  1.26    0.95
3-cycles                               MM ∩ M′                             2.12    0.61
                                       MB ∩ M′                            −0.35    0.85
2-stars                                MM′                                −1.55    0.37
                                       M′M                                −0.43    0.20
                                       BB′                                −1.53    1.08
                                       B′B                                −0.85    0.99
                                       MB′                                −0.14    0.36
                                       M′B                                 0.92    0.35
Subgroup-dependent M effects           1-paths within Medici               3.71    1.12
                                       1-paths between subgroups          −4.67    1.92
                                       1-paths within other subgroups      0.96
Subgroup-dependent B effects           1-paths within Medici               0.70    1.06
                                       1-paths between subgroups          −0.80    0.87
                                       1-paths within other subgroups      0.10
Subgroup-dependent MM effects          2-paths within Medici              −1.33    0.46
                                       2-paths between subgroups           1.08    0.44
                                       2-paths within other subgroups      0.25

Thus, model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell's detailed account.
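As a rough reading aid for Table 7, the sketch below divides a few estimates by their approximate standard errors and converts each estimate to a multiplicative effect on the conditional odds of a tie, exp(estimate), which follows from the logit form of p*. The rows are transcribed as we read them from the table, the function name `summarize` is ours, and the ratios are informal guides only, since the standard errors come from pseudo-likelihood estimation.

```python
import math

# A few rows of Table 7 as (label, estimate, approximate standard error).
TABLE7_ROWS = [
    ("M (choice)", -5.17, 1.02),
    ("B and B' (business reciprocity)", 10.33, 1.72),
    ("MM and M' (marriage 3-cycle)", 2.12, 0.61),
    ("BM (2-path)", -0.84, 0.37),
    ("MB (2-path)", 0.16, 0.38),
]

def summarize(rows):
    """Return (label, estimate / SE, exp(estimate)) for each row."""
    return [(label, est / se, math.exp(est)) for label, est, se in rows]

for label, ratio, odds in summarize(TABLE7_ROWS):
    print(f"{label:34s} ratio={ratio:6.2f}  odds multiplier={odds:12.3f}")
```

An odds multiplier above 1 means that the configuration raises the conditional odds of the tie that completes it; a multiplier below 1 means that it lowers them.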

5. Conclusion

The multivariate p* model is very general in form and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996), and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number r of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research: much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus, a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus, the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form z(x) of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.
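To make the link between labelled paths and explanatory statistics concrete, the following minimal sketch counts a few of the configurations named in Table 7 from two binary adjacency matrices. The four-actor matrices `M` and `B` are made-up toy data and the function names are ours, not the authors'. Each function counts ordered actor tuples on which the configuration is observed, so a mutual dyad is counted twice and an all-marriage 3-cycle three times; whether the paper's statistics use the same normalization is not stated in this section.

```python
from itertools import permutations

def two_paths(X, Y):
    """Count labelled 2-paths i -X-> j -Y-> k over distinct actors i, j, k."""
    n = len(X)
    return sum(X[i][j] * Y[j][k] for i, j, k in permutations(range(n), 3))

def exchange(X, Y):
    """Count ordered pairs (i, j) with an X tie i->j and a Y tie j->i."""
    n = len(X)
    return sum(X[i][j] * Y[j][i] for i, j in permutations(range(n), 2))

def three_cycles(X, Y, Z):
    """Count ordered triples with i -X-> j, j -Y-> k and k -Z-> i."""
    n = len(X)
    return sum(X[i][j] * Y[j][k] * Z[k][i]
               for i, j, k in permutations(range(n), 3))

# Toy data: M is the cycle 0->1->2->0; B has a mutual tie 0<->1 and a tie 2->3.
M = [[0, 1, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
B = [[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 0, 0]]

print("MM 2-paths:", two_paths(M, M))
print("MB 2-paths:", two_paths(M, B))
print("BM 2-paths:", two_paths(B, M))
print("B reciprocity:", exchange(B, B))
print("M/B exchange:", exchange(M, B))
print("MMM 3-cycles:", three_cycles(M, M, M))
```

Each count plays the role of one component of z(x); a hypothesized constraint is then evaluated through the parameter attached to its count.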

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different ‘parts’ of the network (and in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that in the literature on algebraic models for multivariate networks there is a second tradition that can usefully guide such statistical developments. Local structural descriptions based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995) describe heterogeneity across individuals. Thus, a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.
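The contrast between a homogeneous effect and one that varies across parts of the network can be illustrated by splitting a single statistic according to block membership, in the manner of the subgroup-dependent rows of Table 7. In this sketch the toy matrix, the membership set `medici` and the function names are our assumptions for illustration. A homogeneous model would attach one parameter to the total tie count, whereas a heterogeneous model attaches a separate parameter to each of the three block-specific counts.

```python
def block_type(i, j, medici):
    """Classify the ordered pair (i, j) by the party membership of its endpoints."""
    if i in medici and j in medici:
        return "within_medici"
    if i not in medici and j not in medici:
        return "within_other"
    return "between"

def block_tie_counts(X, medici):
    """Split the number of ties in X into the three block-specific counts."""
    counts = {"within_medici": 0, "between": 0, "within_other": 0}
    n = len(X)
    for i in range(n):
        for j in range(n):
            if i != j and X[i][j]:
                counts[block_type(i, j, medici)] += 1
    return counts

# Toy data: actors 0 and 1 form the hypothetical Medici collection.
M_toy = [[0, 1, 0, 0], [0, 0, 1, 0], [1, 0, 0, 0], [0, 0, 0, 0]]
counts = block_tie_counts(M_toy, medici={0, 1})
print(counts, "total:", sum(counts.values()))
```

The block-specific counts always sum to the homogeneous statistic, which is why the homogeneous model is nested within the heterogeneous one.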

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown in general how Markov chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).
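The parametric bootstrap idea can be sketched for a deliberately small case: a single directed relation with only a tie effect and a mutual-dyad effect. This is not Preisler's procedure or the authors' implementation; the function names, the toy parameter values and the 0.5 added to each cell (which keeps the logits finite in small networks, so the estimates are only approximately MPLEs) are all assumptions of the sketch. For this two-parameter model the pseudo-likelihood is a logistic regression of x_ij on x_ji, so the estimates reduce to empirical logits.

```python
import math
import random

def simulate(theta_l, theta_m, n, sweeps, rng):
    """Metropolis-Hastings draw from a toy p* model with tie and mutuality effects."""
    x = [[0] * n for _ in range(n)]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    for _ in range(sweeps * len(pairs)):
        i, j = rng.choice(pairs)
        # Change in theta'z(x) when x[i][j] goes from 0 to 1.
        delta = theta_l + theta_m * x[j][i]
        log_ratio = delta if x[i][j] == 0 else -delta
        if log_ratio >= 0 or rng.random() < math.exp(log_ratio):
            x[i][j] = 1 - x[i][j]  # accept the proposed toggle
    return x

def mple(x):
    """Pseudo-likelihood estimates from the 2x2 table of x[i][j] by x[j][i]."""
    n = len(x)
    c = {(a, b): 0.5 for a in (0, 1) for b in (0, 1)}  # 0.5 avoids log(0)
    for i in range(n):
        for j in range(n):
            if i != j:
                c[(x[i][j], x[j][i])] += 1
    theta_l = math.log(c[(1, 0)] / c[(0, 0)])
    theta_m = math.log(c[(1, 1)] / c[(0, 1)]) - theta_l
    return theta_l, theta_m

def bootstrap_se(x_obs, replicates, sweeps, seed=0):
    """Refit the model to networks simulated at the fitted parameter values."""
    rng = random.Random(seed)
    fit = mple(x_obs)
    draws = [mple(simulate(fit[0], fit[1], len(x_obs), sweeps, rng))
             for _ in range(replicates)]
    ses = []
    for k in range(2):
        mean = sum(d[k] for d in draws) / replicates
        var = sum((d[k] - mean) ** 2 for d in draws) / (replicates - 1)
        ses.append(math.sqrt(var))
    return fit, tuple(ses)

# Stand-in for observed data: one draw at arbitrary toy parameter values.
x_obs = simulate(-1.0, 1.5, n=10, sweeps=50, rng=random.Random(1))
fit, se = bootstrap_se(x_obs, replicates=30, sweeps=30)
print("estimates:", fit, "bootstrap SEs:", se)
```

The proposal toggles one uniformly chosen tie, so it is symmetric and the acceptance ratio is simply the exponential of the change in the linear predictor. In practice the number of sweeps and replicates would need to be far larger, and convergence of the chain would need to be checked.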

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754) and the National Institute of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance and Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.

Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.

Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.

Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Association, 47, 77–92.

Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.

Boorman, S. A. & White, H. C. (1976). Social structure from multiple networks: II. Role structures. American Journal of Sociology, 81, 1384–1446.

Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.

Breiger, R. L., Boorman, S. A. & Arabie, P. (1975). An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.

Coleman, J. S., Katz, E. & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N. & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R. & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B. & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency, and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure, II: Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E. & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds), Operations research proceedings 1986, pp. 455–460. Berlin/Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathématiques, Informatique et Sciences Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B. & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O. & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds), Quo vadis, graph theory? Annals of Discrete Mathematics, 55, 349–366.

Frank, O. & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J. & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J. & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W. & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W. & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J. & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D. & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.

Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L. & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E. & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A. & Pattison, P. (1998). Multivariate p* models of producers' market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P. & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F. & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P. & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G. & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). p1/2. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D. & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M. & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.

Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S. & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S. & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S. & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 60, 401–425.

Wasserman, S. & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S. & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A. & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C. & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999



strong tendency for reciprocated business ties a tendency that is unsurprising given the formof business or economic ties such as partnerships There are weaker tendencies for theexistence of 2-paths comprising either marriage or business ties marriage ties also appear tobe more likely if they complete a cycle of three marriage ties Padgett amp Ansell (1993) notedthe presence of these cycles and analysed both their development and their consequencesthey make a compelling argument for their importance to the evolving structure of theoligarchy It can also be seen from Table 7 that path structures in which an outgoing marriagetie is accompanied by an incoming business tie reduce the likelihood of the overall structureEstimates of star parameters suggest the prevalence of heterogeneous stars in which a groupof families have marriage ties with one group and business ties with another The parameterestimates for homogeneous marriage in-stars and out-stars are both negative there appears tohave been a reduced conditional probability of a marriage tie to a family group if some othergroup also had such a tie and to a lesser extent if the rst family group had another outgoingmarriage tie

The parameters for block-dependent densities suggest an enhanced likelihood ofmarriage ties within the Medici collection of family groups and to a lesser extent within

Logit models and logistic regressions for social networks II 187

the non-Medici collection marriage ties between the two types of family groups were lesslikely Business ties exhibit a substantially weaker pattern of the same form Together thesecharacteristics of the network re ect what Padgett amp Ansell noted was a remarkableinterdependence of marriage and economic ties on the one hand and political partisanshipon the other and they support their conclusion that the microstructure of marriage andeconomics was central to the formation of parties in Florence (1993 p 1277) The block-dependence of marriage 2-paths takes a different and interesting form such paths are lesslikely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection and they are even more likely to link family groups of different types Thegroup containing members of the Medici family is the major contributor to this pattern asthey are the only Medici group with marriage connections outside the collection mobilizedinto the Medici party Note that this structural effect is tted at the same time as the cyclicpattern for marriage ties so that although as Padgett amp Ansell noted there are many moretwo-step marriage connections for non-Medici than for Medici partisans many of the former

Philippa Pattison and Stanley Wasserman188

Table 7 Parameter estimates for model 6d tted to the Florentine network

Model parameter Z hZ Approximate standard error

1-paths M 2 517 102(choice) B 2 737 125

2-cycles M Ccedil M 9 095 094(reciprocity and B Ccedil B 9 1033 172exchange) M Ccedil B 9 065 108

2-paths MM 066 032MB 016 038BM 2 084 037BB 126 095

3-cycles MM Ccedil M 9 212 061MB Ccedil M 9 2 035 085

2-stars MM 9 2 155 037M 9 M 2 043 020BB 9 2 153 108B 9 B 2 085 099MB 9 2 014 036M 9 B 092 035

subgroup-dependen t M effects1-paths within Medici 371 1121-paths between subgroups 2 467 1921-paths within other subgroups 096

subgroup-dependen t B effects1-paths within Medici 070 1061-paths between subgroups 2 080 0871-paths within other subgroups 010

subgroup-dependen t MM effects2-paths within Medici 2 133 0462-paths between subgroups 108 0442-paths within other subgroups 025

connections constitute cycles within the non-Medici collection (hence the larger estimate forthe 2-path parameter for between-collection ties)

Thus model 6d provides a parametric description of the network of marriage and businessties among Florentine family groups that re ects many of the key features of the networkexplicated in Padgett amp Ansellrsquos detailed account

5 Conclusion

The multivariate p model is very general in form and has great potential for developingparsimonious and faithful models for multivariate social relations as the applicationspresented here are intended to illustrate Further we expect that extensions to longitudinalmultivariate data will be worthwhile and relatively straightforward for preliminary steps seeRobins (1998) Such extensions are common in closely related spatial modelling applications(for example Preisler 1993)

In addition to these proposed extensions we believe that there are several questionsspeci c to the modelling of social networks that deserve future close attention The rst isapparent from the analyses presented here and in Wasserman amp Pattison (1996) and concernsthe choice of suitable explanatory statistics from the large number of possibilities Theproblem is particularly important because of the interdependence of many of the networkstatistics we have used and is exacerbated when the number r of relations is large What isneeded is some principled means of making choices among possible explanatory statistics Ofcourse the most useful direction is likely to come from the substantive questions guiding thenetwork research ndash much can be gained by allowing substantive hypotheses to guidemodelling endeavours such as those described here We refer the reader to recent applicationsof these methods to substantive problems (Contractor amp Wasserman 1999 Lazega ampPattison 1998 Lomi amp Pattison 1998) for some illustrations It is clear that a more generalstructural framework for classes of explanatory network statistics would also be useful

One possible basis for such a framework already resides in existing attempts to describe theinterdependence of network relations These descriptions have been algebraic in characterfocusing on the interdependence of labelled paths constructed from multiple social relations(for example Boorman amp White 1976 Boyd 1991 Pattison 1993) or of more generalconnectivity structures (for example Doreian 1980 1986) One of the limitations of theseapproaches is their lack of a stochastic basis hypotheses about speci c constraints placed ona set of network relations by an algebraic model cannot readily be evaluated

Thus a useful next step we argue is to formalize the relationship between the algebraicstructure of path interdependencies and classes of possible network statistics for use in the pframework A link between these network statistics and the algebraic expression of pathinterdependencies is made possible through the class of network statistics we have describedhere We have demonstrated how hypothesized conditional dependencies among paths (suchas some form of generalized transitivity) correspond to some algebraic rule Thus theproblem of choosing a suitable collection of explanatory statistics is closely related to thatof identifying appropriate algebraic path interdependencies or constraints As PattisonWasserman Robins amp Kanfer (in press) have noted there are a number of hypotheses in thesocial network literature about such constraints in addition some useful exploratory methodshave been developed (for example Pattison amp Wasserman 1995) The particular advantageto the expression of these kinds of constraints in the form z(x) of explanatory variables for p

Logit models and logistic regressions for social networks II 189

models is that each hypothesized constraint may be parameterized and evaluated marginal toother such constraints As a result it should indeed be possible to construct principled andparsimonious descriptions of network structure which can be tested statistically

A second line of enquiry that we believe will be particularly fruitful to the development ofthe class of p models that we have described is the further exploration of techniques forassessing the homogeneity of network effects As noted earlier any effect such as some formof generalized transitivity may be assumed to be homogeneous (which is usually a good nullhypothesis) or it may be permitted to vary across different lsquopartsrsquo of the network (and in thislatter case the null hypothesis of homogeneity may be evaluated at least approximately withan alternative hypothesis allowing heterogeneity) We believe that in the literature onalgebraic models for multivariate networks there is a second tradition that can usefullyguide such statistical developments Local structural descriptions based on the interdepen-dencies among paths emanating from (or leading to) each individual in the network (forexample Mandel 1983 Pattison 1989 1993 Pattison amp Wasserman 1995) describeheterogeneity across individuals Thus a useful next step in the application of p modelsis the articulation of the homogeneity of effects in terms of these local algebraic descriptions

Finally an important next step is to address the problems of model evaluation associatedwith the use of MPLEs Several directions are likely to be useful First Preisler (1993)described how a parametric bootstrap method may be used to estimate standard errors forparameter estimates The approach involves simulating the tted p model using theMetropolis ndashHastings algorithm Second Geyer amp Thompson (1992) have shown in generalhow Markov Chain Monte Carlo methods may be used to nd maximum likelihood parameterestimates for models involving complicated dependence structures preliminary steps in thisdirection for the p class of models have been reported by Crouch amp Wasserman (1998)

Acknowledgements

This research was supported by grants from the Australian Research Council the National ScienceFoundation (SBR96-30754) and the National Institute of Health (PHS-1R01-39829-01) Specialthanks go to Sarah Ardu for programming assistance and Ron Breiger Brad Crouch Laura KoehlyJohn Padgett and Garry Robins for helpful comments We are also grateful to the editor and tworeferees for their help in improving this paper

References

Besag J E (1972) Nearest-neighbour systems and the auto-logistic model for binary data Journal ofthe Royal Statistical Society Series B 34 75ndash83

Besag J E (1974) Spatial interaction and the statistical analysis of lattice systems (with discussion)Journal of the Royal Statistical Society Series B 36 196ndash236

Besag J E (1975) Statistical analysis of non-lattice data The Statistician 24 179ndash195Besag J E (1997a) Some methods of statistical analysis for spatial data Bulletin of the International

Statistical Association 47 77ndash92Besag J E (1977b) Ef ciency of pseudo-likelihood estimation for simple Gaussian random elds

Biometrika 64 616ndash618Boorman S A amp White H C (1976) Social structure from multiple networks II Role structures

American Journal of Sociology 81 1384 ndash1446Boyd J P (1991) Social semigroups A unied theory of scaling and blockmodelling as applied to

social networks Fairfax VA George Mason University PressBreiger R L Boorman S A amp Arabie P (1975) An algorithm for clustering relational data with

Philippa Pattison and Stanley Wasserman190

applications to social network analysis and comparision with multidimensional scaling Journalof Mathematical Psychology 12 328ndash383

Coleman, J. S., Katz, E., & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N., & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R., & Wermuth, N. (1996). Multivariate dependencies: Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B., & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds.), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure II: Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E., & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M., & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.

Fienberg, S. E., Meyer, M. M., & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds.), Operations research proceedings 1986, pp. 455–460. Berlin/Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathematiques, Informatique et Science Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B., & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O., & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds.), Quo Vadis, Graph Theory? Annals of Discrete Mathematics, 55, 349–366.

Frank, O., & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J., & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J., & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W., & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W., & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J., & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D., & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.


Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L., & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E., & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A., & Pattison, P. (1998). Multivariate p* models of producers' market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P., & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F., & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds.), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P., & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G., & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). p1/2. In M. G. Everett & K. Rennolls (Eds.), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D., & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M., & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.


Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S., & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S., & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 60, 401–425.

Wasserman, S., & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S., & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A., & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C., & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999



the non-Medici collection; marriage ties between the two types of family groups were less likely. Business ties exhibit a substantially weaker pattern of the same form. Together, these characteristics of the network reflect what Padgett & Ansell noted was a remarkable interdependence of marriage and economic ties, on the one hand, and political partisanship, on the other, and they support their conclusion that the microstructure of marriage and economics was central to the formation of parties in Florence (1993, p. 1277). The block-dependence of marriage 2-paths takes a different and interesting form: such paths are less likely to link a pair of family groups within the Medici collection than a pair within the non-Medici collection, and they are even more likely to link family groups of different types. The group containing members of the Medici family is the major contributor to this pattern, as they are the only Medici group with marriage connections outside the collection mobilized into the Medici party. Note that this structural effect is fitted at the same time as the cyclic pattern for marriage ties, so that although, as Padgett & Ansell noted, there are many more two-step marriage connections for non-Medici than for Medici partisans, many of the former connections constitute cycles within the non-Medici collection (hence the larger estimate for the 2-path parameter for between-collection ties).

Table 7. Parameter estimates for model 6d fitted to the Florentine network

Model parameter                        Z            θ_Z     Approximate standard error

1-paths (choice)                       M           -5.17    1.02
                                       B           -7.37    1.25

2-cycles (reciprocity and exchange)    M ∩ M′       0.95    0.94
                                       B ∩ B′      10.33    1.72
                                       M ∩ B′       0.65    1.08

2-paths                                MM           0.66    0.32
                                       MB           0.16    0.38
                                       BM          -0.84    0.37
                                       BB           1.26    0.95

3-cycles                               MM ∩ M′      2.12    0.61
                                       MB ∩ M′     -0.35    0.85

2-stars                                MM′         -1.55    0.37
                                       M′M         -0.43    0.20
                                       BB′         -1.53    1.08
                                       B′B         -0.85    0.99
                                       MB′         -0.14    0.36
                                       M′B          0.92    0.35

Subgroup-dependent M effects
  1-paths within Medici                             3.71    1.12
  1-paths between subgroups                        -4.67    1.92
  1-paths within other subgroups                    0.96

Subgroup-dependent B effects
  1-paths within Medici                             0.70    1.06
  1-paths between subgroups                        -0.80    0.87
  1-paths within other subgroups                    0.10

Subgroup-dependent MM effects
  2-paths within Medici                            -1.33    0.46
  2-paths between subgroups                         1.08    0.44
  2-paths within other subgroups                    0.25

Thus, model 6d provides a parametric description of the network of marriage and business ties among Florentine family groups that reflects many of the key features of the network explicated in Padgett & Ansell's detailed account.

5. Conclusion

The multivariate p* model is very general in form and has great potential for developing parsimonious and faithful models for multivariate social relations, as the applications presented here are intended to illustrate. Further, we expect that extensions to longitudinal multivariate data will be worthwhile and relatively straightforward; for preliminary steps, see Robins (1998). Such extensions are common in closely related spatial modelling applications (for example, Preisler, 1993).

In addition to these proposed extensions, we believe that there are several questions specific to the modelling of social networks that deserve future close attention. The first is apparent from the analyses presented here and in Wasserman & Pattison (1996), and concerns the choice of suitable explanatory statistics from the large number of possibilities. The problem is particularly important because of the interdependence of many of the network statistics we have used, and is exacerbated when the number r of relations is large. What is needed is some principled means of making choices among possible explanatory statistics. Of course, the most useful direction is likely to come from the substantive questions guiding the network research: much can be gained by allowing substantive hypotheses to guide modelling endeavours such as those described here. We refer the reader to recent applications of these methods to substantive problems (Contractor & Wasserman, 1999; Lazega & Pattison, 1998; Lomi & Pattison, 1998) for some illustrations. It is clear that a more general structural framework for classes of explanatory network statistics would also be useful.

One possible basis for such a framework already resides in existing attempts to describe the interdependence of network relations. These descriptions have been algebraic in character, focusing on the interdependence of labelled paths constructed from multiple social relations (for example, Boorman & White, 1976; Boyd, 1991; Pattison, 1993) or of more general connectivity structures (for example, Doreian, 1980, 1986). One of the limitations of these approaches is their lack of a stochastic basis: hypotheses about specific constraints placed on a set of network relations by an algebraic model cannot readily be evaluated.

Thus, a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus, the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form z(x) of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.
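To make the idea of explanatory statistics z(x) for two relations concrete, the following Python sketch counts a few configurations of the kind labelled in Table 7 (1-paths, 2-cycles, 2-paths and 3-cycles) from a pair of binary adjacency matrices. It is a minimal illustration only: the function name `configuration_counts`, the dictionary keys, and the counting conventions (distinct nodes, mutual dyads and 3-cycles counted once) are assumptions made for this example and may differ from the conventions used in fitting model 6d.

```python
import numpy as np

def configuration_counts(M, B):
    """Count some multivariate configurations for two directed relations M and B.

    M[i, j] = 1 if i sends an M tie to j (likewise for B). Conventions are
    illustrative assumptions, not necessarily those of the paper.
    """
    M = np.asarray(M, dtype=int)
    B = np.asarray(B, dtype=int)
    n = M.shape[0]
    off = 1 - np.eye(n, dtype=int)   # mask that removes the diagonal
    M, B = M * off, B * off          # ignore self-ties

    def two_paths(A, C):
        # i -A-> j -C-> k with i, j, k distinct (zero diagonals force j != i, k)
        return int(((A @ C) * off).sum())

    return {
        "1-path M": int(M.sum()),
        "1-path B": int(B.sum()),
        "2-cycle M&M'": int((M * M.T).sum()) // 2,   # mutual M dyads
        "2-cycle B&B'": int((B * B.T).sum()) // 2,   # mutual B dyads
        "2-cycle M&B'": int((M * B.T).sum()),        # M tie returned by a B tie
        "2-path MM": two_paths(M, M),
        "2-path MB": two_paths(M, B),
        "2-path BM": two_paths(B, M),
        "2-path BB": two_paths(B, B),
        "3-cycle MM&M'": int(np.trace(M @ M @ M)) // 3,
        "3-cycle MB&M'": int(np.trace(M @ B @ M)),
    }
```

Each count can serve as one component of z(x); the corresponding change statistics (the difference in z(x) when a single tie is switched on and off) are the explanatory variables in the logistic regression form of the model.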

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different 'parts' of the network (and in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that in the literature on algebraic models for multivariate networks, there is a second tradition that can usefully guide such statistical developments. Local structural descriptions based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995) describe heterogeneity across individuals. Thus, a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown in general how Markov Chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).
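The first of these directions can be sketched in Python under strong simplifying assumptions. The example below uses a single directed relation with only choice and reciprocity statistics, which is far simpler than the multivariate model 6d. It simulates networks by single-tie Metropolis toggles, refits the pseudo-likelihood (logistic regression) estimates to each simulated network, and takes the standard deviation of the refitted estimates as a bootstrap standard error. All function names, the Newton-Raphson fitting routine and the number of sweeps are hypothetical choices for illustration, not the procedure of Preisler (1993).

```python
import numpy as np

def change_stats(x, i, j):
    """Change in (number of ties, number of mutual dyads) when x[i, j] goes 0 -> 1."""
    return np.array([1.0, float(x[j, i])])

def simulate(theta, n, sweeps, rng):
    """Draw one directed graph by single-tie Metropolis toggles."""
    theta = np.asarray(theta, dtype=float)
    x = np.zeros((n, n), dtype=int)
    for _ in range(sweeps * n * (n - 1)):
        i, j = rng.integers(n), rng.integers(n)
        if i == j:
            continue  # no self-ties
        logit = theta @ change_stats(x, i, j)
        # Log probability ratio of the toggled graph to the current graph.
        log_ratio = logit if x[i, j] == 0 else -logit
        if np.log(rng.random()) < log_ratio:
            x[i, j] = 1 - x[i, j]
    return x

def mple(x, iters=25):
    """Pseudo-likelihood fit: logistic regression of each tie on its change statistics."""
    n = x.shape[0]
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    d = np.array([change_stats(x, i, j) for i, j in pairs])
    y = np.array([x[i, j] for i, j in pairs], dtype=float)
    theta = np.zeros(d.shape[1])
    for _ in range(iters):  # Newton-Raphson steps
        p = 1.0 / (1.0 + np.exp(-(d @ theta)))
        grad = d.T @ (y - p)
        hess = (d * (p * (1.0 - p))[:, None]).T @ d
        # Tiny ridge term (an assumption of this sketch) guards against a singular Hessian.
        theta = theta + np.linalg.solve(hess + 1e-8 * np.eye(len(theta)), grad)
    return theta

def bootstrap_se(theta_hat, n, reps, rng, sweeps=30):
    """Standard deviation of MPLEs refitted to graphs simulated from theta_hat."""
    fits = [mple(simulate(theta_hat, n, sweeps, rng)) for _ in range(reps)]
    return np.std(fits, axis=0, ddof=1)
```

In use, one would compute `theta_hat = mple(x_observed)` and then call `bootstrap_se(theta_hat, n, reps, rng)` with a generator such as `np.random.default_rng(0)`. For richer models the chain would need many more sweeps, and convergence should be checked before the bootstrap standard errors are trusted.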


Page 21: Logit models and logistic regressions for social …...Fienberg, Meyer & Wasserman (1985), Wasserman (1987), Iacobucci & Wasserman (1987) and Iacobucci (1989) extended thep1family

connections constitute cycles within the non-Medici collection (hence the larger estimate forthe 2-path parameter for between-collection ties)

Thus model 6d provides a parametric description of the network of marriage and businessties among Florentine family groups that re ects many of the key features of the networkexplicated in Padgett amp Ansellrsquos detailed account

5 Conclusion

The multivariate p model is very general in form and has great potential for developingparsimonious and faithful models for multivariate social relations as the applicationspresented here are intended to illustrate Further we expect that extensions to longitudinalmultivariate data will be worthwhile and relatively straightforward for preliminary steps seeRobins (1998) Such extensions are common in closely related spatial modelling applications(for example Preisler 1993)

In addition to these proposed extensions we believe that there are several questionsspeci c to the modelling of social networks that deserve future close attention The rst isapparent from the analyses presented here and in Wasserman amp Pattison (1996) and concernsthe choice of suitable explanatory statistics from the large number of possibilities Theproblem is particularly important because of the interdependence of many of the networkstatistics we have used and is exacerbated when the number r of relations is large What isneeded is some principled means of making choices among possible explanatory statistics Ofcourse the most useful direction is likely to come from the substantive questions guiding thenetwork research ndash much can be gained by allowing substantive hypotheses to guidemodelling endeavours such as those described here We refer the reader to recent applicationsof these methods to substantive problems (Contractor amp Wasserman 1999 Lazega ampPattison 1998 Lomi amp Pattison 1998) for some illustrations It is clear that a more generalstructural framework for classes of explanatory network statistics would also be useful

One possible basis for such a framework already resides in existing attempts to describe theinterdependence of network relations These descriptions have been algebraic in characterfocusing on the interdependence of labelled paths constructed from multiple social relations(for example Boorman amp White 1976 Boyd 1991 Pattison 1993) or of more generalconnectivity structures (for example Doreian 1980 1986) One of the limitations of theseapproaches is their lack of a stochastic basis hypotheses about speci c constraints placed ona set of network relations by an algebraic model cannot readily be evaluated

Thus, a useful next step, we argue, is to formalize the relationship between the algebraic structure of path interdependencies and classes of possible network statistics for use in the p* framework. A link between these network statistics and the algebraic expression of path interdependencies is made possible through the class of network statistics we have described here. We have demonstrated how hypothesized conditional dependencies among paths (such as some form of generalized transitivity) correspond to some algebraic rule. Thus, the problem of choosing a suitable collection of explanatory statistics is closely related to that of identifying appropriate algebraic path interdependencies, or constraints. As Pattison, Wasserman, Robins & Kanfer (in press) have noted, there are a number of hypotheses in the social network literature about such constraints; in addition, some useful exploratory methods have been developed (for example, Pattison & Wasserman, 1995). The particular advantage to the expression of these kinds of constraints in the form z(x) of explanatory variables for p* models is that each hypothesized constraint may be parameterized and evaluated marginal to other such constraints. As a result, it should indeed be possible to construct principled and parsimonious descriptions of network structure which can be tested statistically.
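For reference, the way in which a vector z(x) of explanatory statistics enters a p* model can be restated compactly. The notation below is assumed rather than quoted: θ denotes the parameter vector, κ(θ) the normalizing constant, X^C_ijm the set of all ties other than the tie from i to j on relation m, and x^+_ijm and x^-_ijm the multigraph x with that tie forced to be present and absent, respectively. Each hypothesized constraint contributes one component of z(x) and one component of θ.

```latex
\Pr(\mathbf{X} = \mathbf{x}) = \frac{\exp\{\boldsymbol{\theta}'\mathbf{z}(\mathbf{x})\}}{\kappa(\boldsymbol{\theta})},
\qquad
\log\frac{\Pr(X_{ijm} = 1 \mid \mathbf{X}^{C}_{ijm})}{\Pr(X_{ijm} = 0 \mid \mathbf{X}^{C}_{ijm})}
= \boldsymbol{\theta}'\bigl[\mathbf{z}(\mathbf{x}^{+}_{ijm}) - \mathbf{z}(\mathbf{x}^{-}_{ijm})\bigr].
```

Read in this way, a constraint whose parameter is indistinguishable from zero, given the other components, is a candidate for removal from the model.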

A second line of enquiry that we believe will be particularly fruitful to the development of the class of p* models that we have described is the further exploration of techniques for assessing the homogeneity of network effects. As noted earlier, any effect, such as some form of generalized transitivity, may be assumed to be homogeneous (which is usually a good null hypothesis), or it may be permitted to vary across different ‘parts’ of the network (and in this latter case, the null hypothesis of homogeneity may be evaluated, at least approximately, with an alternative hypothesis allowing heterogeneity). We believe that in the literature on algebraic models for multivariate networks there is a second tradition that can usefully guide such statistical developments. Local structural descriptions based on the interdependencies among paths emanating from (or leading to) each individual in the network (for example, Mandel, 1983; Pattison, 1989, 1993; Pattison & Wasserman, 1995) describe heterogeneity across individuals. Thus, a useful next step in the application of p* models is the articulation of the homogeneity of effects in terms of these local algebraic descriptions.

Finally, an important next step is to address the problems of model evaluation associated with the use of MPLEs. Several directions are likely to be useful. First, Preisler (1993) described how a parametric bootstrap method may be used to estimate standard errors for parameter estimates. The approach involves simulating the fitted p* model using the Metropolis–Hastings algorithm. Second, Geyer & Thompson (1992) have shown in general how Markov chain Monte Carlo methods may be used to find maximum likelihood parameter estimates for models involving complicated dependence structures; preliminary steps in this direction for the p* class of models have been reported by Crouch & Wasserman (1998).
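The parametric bootstrap idea can be illustrated with a minimal sketch. It is not Preisler's implementation, and it rests on several assumptions made only for illustration: a single directed relation, a model with just two statistics (number of ties and number of mutual dyads), single-tie Metropolis–Hastings toggles started from the empty graph, and a Newton–Raphson logistic regression with a small ridge term for numerical stability. The function names (`change_stats`, `fit_mple`, `simulate`, `bootstrap_se`) and the chain length are hypothetical choices.

```python
import numpy as np

def change_stats(x):
    """Return (d, y): change statistics and tie indicators for all ordered pairs i != j.

    Assumed model: z(x) = (number of ties, number of mutual dyads), so the
    change statistic for the tie i -> j is (1, x[j, i]).
    """
    n = x.shape[0]
    off = ~np.eye(n, dtype=bool)
    y = x[off].astype(float)
    d = np.column_stack([np.ones(y.size), x.T[off]])
    return d, y

def fit_mple(x, iters=25, ridge=1e-6):
    """Maximum pseudo-likelihood estimate via Newton-Raphson logistic regression."""
    d, y = change_stats(x)
    theta = np.zeros(d.shape[1])
    for _ in range(iters):
        p = 1.0 / (1.0 + np.exp(-np.clip(d @ theta, -30, 30)))
        # Small ridge keeps the Hessian invertible if a column is constant.
        hess = d.T @ (d * (p * (1 - p))[:, None]) + ridge * np.eye(d.shape[1])
        theta = theta + np.linalg.solve(hess, d.T @ (y - p))
    return theta

def simulate(theta, n, rng, sweeps=20):
    """Draw one digraph from the assumed model by Metropolis-Hastings tie toggles."""
    x = np.zeros((n, n), dtype=int)
    for _ in range(sweeps * n * (n - 1)):
        i = rng.integers(n)
        j = (i + 1 + rng.integers(n - 1)) % n      # uniform over j != i
        logit = theta[0] + theta[1] * x[j, i]      # conditional log-odds of tie i -> j
        log_ratio = logit if x[i, j] == 0 else -logit
        if rng.random() < np.exp(min(0.0, log_ratio)):
            x[i, j] = 1 - x[i, j]
    return x

def bootstrap_se(x_obs, n_boot=50, seed=0):
    """Fit the MPLE, simulate from the fitted model, refit, and return (estimate, SE)."""
    rng = np.random.default_rng(seed)
    theta_hat = fit_mple(x_obs)
    n = x_obs.shape[0]
    reps = np.array([fit_mple(simulate(theta_hat, n, rng)) for _ in range(n_boot)])
    return theta_hat, reps.std(axis=0, ddof=1)

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    x_obs = simulate(np.array([-1.0, 1.5]), 12, rng)   # stand-in for observed data
    theta_hat, se = bootstrap_se(x_obs, n_boot=30)
    print("MPLE:", theta_hat, "bootstrap SE:", se)
```

In practice the chain length and the number of bootstrap replicates would need to be checked against the dependence structure of the fitted model; models with Markov or higher-order dependence typically mix far more slowly than this dyad-level example.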

Acknowledgements

This research was supported by grants from the Australian Research Council, the National Science Foundation (SBR96-30754) and the National Institute of Health (PHS-1R01-39829-01). Special thanks go to Sarah Ardu for programming assistance, and Ron Breiger, Brad Crouch, Laura Koehly, John Padgett and Garry Robins for helpful comments. We are also grateful to the editor and two referees for their help in improving this paper.

References

Besag, J. E. (1972). Nearest-neighbour systems and the auto-logistic model for binary data. Journal of the Royal Statistical Society, Series B, 34, 75–83.

Besag, J. E. (1974). Spatial interaction and the statistical analysis of lattice systems (with discussion). Journal of the Royal Statistical Society, Series B, 36, 196–236.

Besag, J. E. (1975). Statistical analysis of non-lattice data. The Statistician, 24, 179–195.

Besag, J. E. (1977a). Some methods of statistical analysis for spatial data. Bulletin of the International Statistical Association, 47, 77–92.

Besag, J. E. (1977b). Efficiency of pseudo-likelihood estimation for simple Gaussian random fields. Biometrika, 64, 616–618.

Boorman, S. A. & White, H. C. (1976). Social structure from multiple networks: II. Role structures. American Journal of Sociology, 81, 1384–1446.

Boyd, J. P. (1991). Social semigroups: A unified theory of scaling and blockmodelling as applied to social networks. Fairfax, VA: George Mason University Press.

Breiger, R. L., Boorman, S. A. & Arabie, P. (1975). An algorithm for clustering relational data with applications to social network analysis and comparison with multidimensional scaling. Journal of Mathematical Psychology, 12, 328–383.

Coleman, J. S., Katz, E. & Menzel, H. (1966). Medical innovation: A diffusion study. Indianapolis: Bobbs-Merrill.

Contractor, N. & Wasserman, S. (1999). A new framework for testing hypotheses about social network theories. Paper presented at the 1999 International Network for Social Network Analysis Annual Meeting, Charleston, SC, February.

Cox, D. R. & Wermuth, N. (1996). Multivariate dependencies – Models, analysis and interpretation. London: Chapman & Hall.

Crouch, B. & Wasserman, S. (1998). Fitting p*: Monte Carlo maximum likelihood estimation. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Davis, J. A. (1968). Statistical analysis of pair relationships: Symmetry, subjective consistency and reciprocity. Sociometry, 31, 102–119.

Diggle, P. J. (1996). Spatial analysis in biometry. In P. Armitage & H. A. David (Eds), Advances in biometry. New York: Wiley.

Doreian, P. (1980). On the evolution of group and network structure. Social Networks, 2, 235–252.

Doreian, P. (1986). On the evolution of group and network structure II: Structures within structures. Social Networks, 8, 33–64.

Edwards, D. (1995). Introduction to graphical modeling. New York: Springer-Verlag.

Fienberg, S. E. & Wasserman, S. (1981). Categorical data analysis of single sociometric relations. In S. Leinhardt (Ed.), Sociological methodology 1981, pp. 156–192. San Francisco: Jossey-Bass.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1981). Analyzing data from multivariate directed graphs: An application to social networks. In V. Barnett (Ed.), Interpreting multivariate data, pp. 289–306. Chichester: Wiley.

Fienberg, S. E., Meyer, M. M. & Wasserman, S. (1985). Statistical analysis of multiple sociometric relations. Journal of the American Statistical Association, 80, 51–67.

Frank, O. (1987). Multiple relation data analysis. In H. Iserman, G. Merle, U. Reider, R. Schmidt & L. Streitferdt (Eds), Operations research proceedings 1986, pp. 455–460. Berlin/Heidelberg: Springer-Verlag.

Frank, O. (1991). Statistical analysis of change in networks. Statistica Neerlandica, 45, 283–293.

Frank, O. (1997). Composition and structure of social networks. Mathématiques, Informatique et Sciences Humaines, 137, 11–23.

Frank, O., Lundquist, S., Wellman, B. & Wilson, C. (1986). Analysis of composition and structure of social networks. Unpublished manuscript.

Frank, O. & Nowicki, K. (1993). Exploratory statistical analysis of networks. In J. Gimbel, J. W. Kennedy & L. V. Quintas (Eds), Quo Vadis, Graph Theory? Annals of Discrete Mathematics, 55, 349–366.

Frank, O. & Strauss, D. (1986). Markov graphs. Journal of the American Statistical Association, 81, 832–842.

Galaskiewicz, J. & Marsden, P. V. (1978). Interorganizational resource networks: Formal patterns of overlap. Social Science Research, 7, 89–107.

Geyer, C. J. & Thompson, E. A. (1992). Constrained Monte Carlo maximum likelihood for dependent data. Journal of the Royal Statistical Society, Series B, 54, 657–699.

Holland, P. W. & Leinhardt, S. (1973). The structural implications of measurement error in sociometry. Journal of Mathematical Sociology, 3, 85–111.

Holland, P. W. & Leinhardt, S. (1981). An exponential family of probability distributions for directed graphs (with discussion). Journal of the American Statistical Association, 76, 33–65.

Hubert, L. J. & Baker, F. B. (1978). Evaluating the conformity of sociometric measurements. Psychometrika, 43, 31–41.

Iacobucci, D. (1989). Modeling multivariate sequential dyadic interactions. Social Networks, 11, 315–362.

Iacobucci, D. & Wasserman, S. (1987). Dyadic social interactions. Psychological Bulletin, 102, 293–306.

Ising, E. (1925). Beitrag zur Theorie des Ferromagnetismus. Zeitschrift für Physik, 31, 253–258.

Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L. & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E. & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A. & Pattison, P. (1998). Multivariate p* models of producers’ market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P. & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F. & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P. & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G. & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). p1/2. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D. & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M. & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.

Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S. & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S. & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S. & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 61, 401–425.

Wasserman, S. & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S. & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A. & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C. & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999

Logit models and logistic regressions for social networks II 193

Page 22: Logit models and logistic regressions for social …...Fienberg, Meyer & Wasserman (1985), Wasserman (1987), Iacobucci & Wasserman (1987) and Iacobucci (1989) extended thep1family

models is that each hypothesized constraint may be parameterized and evaluated marginal toother such constraints As a result it should indeed be possible to construct principled andparsimonious descriptions of network structure which can be tested statistically

A second line of enquiry that we believe will be particularly fruitful to the development ofthe class of p models that we have described is the further exploration of techniques forassessing the homogeneity of network effects As noted earlier any effect such as some formof generalized transitivity may be assumed to be homogeneous (which is usually a good nullhypothesis) or it may be permitted to vary across different lsquopartsrsquo of the network (and in thislatter case the null hypothesis of homogeneity may be evaluated at least approximately withan alternative hypothesis allowing heterogeneity) We believe that in the literature onalgebraic models for multivariate networks there is a second tradition that can usefullyguide such statistical developments Local structural descriptions based on the interdepen-dencies among paths emanating from (or leading to) each individual in the network (forexample Mandel 1983 Pattison 1989 1993 Pattison amp Wasserman 1995) describeheterogeneity across individuals Thus a useful next step in the application of p modelsis the articulation of the homogeneity of effects in terms of these local algebraic descriptions

Finally an important next step is to address the problems of model evaluation associatedwith the use of MPLEs Several directions are likely to be useful First Preisler (1993)described how a parametric bootstrap method may be used to estimate standard errors forparameter estimates The approach involves simulating the tted p model using theMetropolis ndashHastings algorithm Second Geyer amp Thompson (1992) have shown in generalhow Markov Chain Monte Carlo methods may be used to nd maximum likelihood parameterestimates for models involving complicated dependence structures preliminary steps in thisdirection for the p class of models have been reported by Crouch amp Wasserman (1998)

Acknowledgements

This research was supported by grants from the Australian Research Council the National ScienceFoundation (SBR96-30754) and the National Institute of Health (PHS-1R01-39829-01) Specialthanks go to Sarah Ardu for programming assistance and Ron Breiger Brad Crouch Laura KoehlyJohn Padgett and Garry Robins for helpful comments We are also grateful to the editor and tworeferees for their help in improving this paper

References

Besag J E (1972) Nearest-neighbour systems and the auto-logistic model for binary data Journal ofthe Royal Statistical Society Series B 34 75ndash83

Besag J E (1974) Spatial interaction and the statistical analysis of lattice systems (with discussion)Journal of the Royal Statistical Society Series B 36 196ndash236

Besag J E (1975) Statistical analysis of non-lattice data The Statistician 24 179ndash195Besag J E (1997a) Some methods of statistical analysis for spatial data Bulletin of the International

Statistical Association 47 77ndash92Besag J E (1977b) Ef ciency of pseudo-likelihood estimation for simple Gaussian random elds

Biometrika 64 616ndash618Boorman S A amp White H C (1976) Social structure from multiple networks II Role structures

American Journal of Sociology 81 1384 ndash1446Boyd J P (1991) Social semigroups A unied theory of scaling and blockmodelling as applied to

social networks Fairfax VA George Mason University PressBreiger R L Boorman S A amp Arabie P (1975) An algorithm for clustering relational data with

Philippa Pattison and Stanley Wasserman190

applications to social network analysis and comparision with multidimensional scaling Journalof Mathematical Psychology 12 328ndash383

Coleman J S Katz E amp Menzel H (1966) Medical innovation A diffusion study IndianapolisBobbs-Merrill

Contractor N amp Wasserman S (1999) A new framework for testing hypotheses about social networktheories Paper presented at the 1999 International Network for Social Network Analysis AnnualMeeting Charleston SC February

Cox DR amp Wermuth N (1996) Multivariate dependencies ndash Models analysis and interpretationLondon Chapman amp Hall

Crouch B amp Wasserman S (1998) Fitting p Monte Carlo maximum likelihood estimation Paperpresented at the 1998 International Network for Social Network Analysis Annual MeetingSitges Spain May

Davis J A (1968) Statistical analysis of pair relationships Symmetry subjective consistency andreciprocity Sociometry 31 102ndash119

Diggle P J (1996) Spatial analysis in biometry In P Armitage amp H A David (Eds) Advances inbiometry New York Wiley

Doreian P (1980) On the evolution of group and network structure Social Networks 2 235ndash252Doreian P (1986) On the evolution of group and network structure II Structures within structures

Social Networks 8 33ndash64Edwards D (1995) Introduction to graphical modeling New York Springer-Verlag Fienberg S E amp Wasserman S (1981) Categorical data analysis of single sociometric relations In S

Leinhardt (Ed) Sociological methodology 1981 pp 156ndash192 San Francisco Jossey-BassFienberg S E Meyer M M amp Wasserman S (1981) Analyzing data from multivariate directed

graphs An application to social networks In V Barnett (Ed) Interpreting multivariate datapp 289ndash306 Chichester Wiley

Fienberg S E Meyer M M amp Wasserman S (1985) Statistical analysis of multiple sociometricrelations Journal of the American Statistical Association 80 51ndash67

Frank O (1987) Multiple relation data analysis In H Iserman G Merle U Reider R Schmidt ampL Streitferdt (Eds) Operations research proceedings 1986 pp 455ndash460 BerlinHeidelbergSpringer-Verla g

Frank O (1991) Statistical analysis of change in networks Statistica Neerlandica 45 283ndash293Frank O (1997) Composition and structure of social networks Mathematiques Informatique et

Science Humaines 137 11ndash23Frank O Lundquist S Wellman B amp Wilson C (1986) Analysis of composition and structure of

social networks Unpublished manuscriptFrank O amp Nowicki K (1993) Exploratory statistical analysis of networks In J Gimbel J W

Kennedy amp L V Quintas (Eds) Quo Vadis Graph Theory Annals of Discrete Mathematics 55349ndash366

Frank O amp Strauss D (1986) Markov graphs Journal of the American Statistical Association 81832ndash842

Galaskiewicz J amp Marsden P V (1978) Interorganizationa l resource networks Formal patterns ofoverlap Social Science Research 7 89ndash107

Geyer C J amp Thompson E A (1992) Constrained Monte Carlo maximum likelihood for dependentdata Journal of the Royal Statistical Society Series B 54 657ndash699

Holland P W amp Leinhardt S (1973) The structural implications of measurement error in sociometryJournal of Mathematical Sociology 3 85ndash111

Holland P W amp Leinhardt S (1981) An exponential family of probability distributions for directedgraphs (with discussion) Journal of the American Statistical Association 76 33ndash65

Hubert L J amp Baker F B (1978) Evaluating the conformity of sociometric measurementsPsychometrika 43 31ndash41

Iacobucc i D (1989) Modeling multivaria te sequenti al dyadic interact ions Social Networks 11315ndash362

Iacobucci D amp Wasserman S (1987) Dyadic social interactions Psychological Bulletin 102 293ndash306

Ising E (1925) Beitrag zur Theorie des Ferromagnetism us Zeitscrhift fur Physik 31 253ndash258

Logit models and logistic regressions for social networks II 191

Johnsen E C (1986) Structure and process Agreement models for friendship formation SocialNetworks 8 257ndash306

Katz L amp Powell J H (1953) A proposed index of the conformity of one sociometric measurement toanother Psychometrika 18 249ndash256

Kent D (1978) The rise of the MediciFaction in Florence1426ndash1434 Oxford Oxford University PressLauritzen S (1996) Graphical models Oxford Oxford University PressLazega E amp Pattison P (1998) Social capital multiplex generalized exchange and cooperation in

organizations A case study Paper presented at the 1998 International Network for SocialNetwork Analysis Annual Meeting Sitges Spain May

Lee N H (1969) The search for an abortionist Chicago University of Chicago PressLeifer E M (1988) Interaction preludes to role setting Exploratory local action American Socio-

logical Review 53 865ndash878Lomi A amp Pattison P (1998) Multivariate p models of producersrsquomarket structure Paper presented

at the 1998 International Network for Social Network Analysis Annual Meeting Sitges Spain MayLorrain F P amp White H C (1971) Structural equivalence of individuals in social networks Journal

of Mathematical Sociology 1 49ndash80Mandel M (1983) Local roles and social networks American Sociological Review 48 376ndash386Mayer A C (1977) The signi cance of quasi-groups in the study of complex societies In S Leinhardt

(Ed) Social networks A developing paradigm pp 293ndash318 New York Academic PressMerton R K (1957) Social theory and social structure New York Free PressMichaelson A G (1990) Network mechanisms underlying diffusion processes Interaction and

friendship in a scienti c community Doctoral thesis School of Social Science University ofCalifornia Irvine

Nadel S F (1957) The theory of social structure Melbourne Melbourne University PressPadgett J F amp Ansell C K (1993) Robust action and the rise of the Medici 1400 ndash1434 American

Journal of Sociology 98 1259 ndash1319Parsons T (1966) The structure of social action New York Free PressPattison P (1989) Mathematical models for local social networks In J A Keats R Taft R A Heath

amp S H Lovibond (Eds) Mathematical and theoretical systems pp 139ndash149 AmsterdamNorth-Holland

Pattison P (1993) Algebraic models for social networks New York Cambridge University PressPattison P amp Wasserman S (1995) Constructing algebraic models for local social networks using

statistical methods Journal of Mathematical Psychology 39 57ndash72Pattison P Wasserman S Robins G amp Kanfer A M (in press) Statistical evaluation of algebraic

constraints for social relations Journal of Mathematical PsychologyPreisler H (1993) Modeling spatial patterns of trees attacked by bark-beetles Applied Statistics 42

501ndash514Rennolls K (1995) p12 In M G Everett amp K Rennolls (Eds) Proceedings of the 1995 International

Conference on Social Networks 1 pp 151ndash160 London Greenwich University PressRobins G L (1998) Personal attributes in inter-personal contexts Statistical models for individual

characteristics and social relationships PhD dissertation Department of Psychology Universityof Melbourne

Strauss D (1986) On a general class of models for interaction SIAM Review 28 513ndash527Strauss D (1992) The many faces of logistic regression American Statistician 46 321ndash327Strauss D amp Ikeda M (1990) Pseudolikelihood estimation for social networks Journal of the

American Statistical Association 85 204ndash212Vickers M (1981) Relational analysis An applied evaluation MSc thesis Department of Psychology

University of MelbourneVickers M amp Chan S (1981) Representing classroom social structure Melbourne Victoria Institute

of Secondary EducationWalker M E (1995) Statistical models for social support networks Application of exponential

models to undirected graphs with dyadic dependencies Doctoral dissertation Department ofPsychology University of Illinois

Wasserman S (1978) Models for binary directed graphs and their applications Advances in AppliedProbability 10 803ndash818

Philippa Pattison and Stanley Wasserman192

Wasserman S (1987) Conformity of two sociometric relations Psychometrika 52 3ndash18Wasserman S amp Anderson C (1987) Stochastic a priori blockmodels Construction and assessment

Social Networks 9 1ndash36Wasserman S amp Faust K (1994) Social network analysis Methods and applications New York

Cambridge University PressWasserman S amp Pattison P (1996) Logit models and logistic regressions for social networks I An

introduction to Markov graphs and p Psychometrika 60 401ndash425Wasserman S amp Pattison P (in press) Multivariate random graph distributions Lecture Notes in

Statistics New York Springer-Verla gWellman B Frank O Espinoza V Lundquist S amp Wilson C (1991) Integrating individual

relational and structural analyses Social Networks 13 223ndash249White H C (1963) The anatomy of kinship Mathematical models for structures of cumulated social

roles Englewood Cliffs NJ Prentice HallWhite H C (1977) Probabilities of homomorphic mappings from multiple graphs Journal of

Mathematical Psychology 16 121ndash134White H C Boorman S A amp Breiger R L (1976) Social structure from multiple networks I

Blockmodels of roles and positions American Journal of Sociology 81 730ndash780Whittaker J (1990) Graphical models in applied statistics Chichester WileyWinship C amp Mandel M (1983) Roles and positions A critique and extension of the blockmodeling

approach In S Leinhardt (Ed) Sociological methodology 1983ndash1984 pp 314ndash344 SanFrancisco Jossey-Bass

Received 29 May 1997 revised version received 2 February 1999

Logit models and logistic regressions for social networks II 193

Page 23: Logit models and logistic regressions for social …...Fienberg, Meyer & Wasserman (1985), Wasserman (1987), Iacobucci & Wasserman (1987) and Iacobucci (1989) extended thep1family

applications to social network analysis and comparision with multidimensional scaling Journalof Mathematical Psychology 12 328ndash383

Coleman J S Katz E amp Menzel H (1966) Medical innovation A diffusion study IndianapolisBobbs-Merrill

Contractor N amp Wasserman S (1999) A new framework for testing hypotheses about social networktheories Paper presented at the 1999 International Network for Social Network Analysis AnnualMeeting Charleston SC February

Cox DR amp Wermuth N (1996) Multivariate dependencies ndash Models analysis and interpretationLondon Chapman amp Hall

Crouch B amp Wasserman S (1998) Fitting p Monte Carlo maximum likelihood estimation Paperpresented at the 1998 International Network for Social Network Analysis Annual MeetingSitges Spain May

Davis J A (1968) Statistical analysis of pair relationships Symmetry subjective consistency andreciprocity Sociometry 31 102ndash119

Diggle P J (1996) Spatial analysis in biometry In P Armitage amp H A David (Eds) Advances inbiometry New York Wiley

Doreian P (1980) On the evolution of group and network structure Social Networks 2 235ndash252Doreian P (1986) On the evolution of group and network structure II Structures within structures

Social Networks 8 33ndash64Edwards D (1995) Introduction to graphical modeling New York Springer-Verlag Fienberg S E amp Wasserman S (1981) Categorical data analysis of single sociometric relations In S

Leinhardt (Ed) Sociological methodology 1981 pp 156ndash192 San Francisco Jossey-BassFienberg S E Meyer M M amp Wasserman S (1981) Analyzing data from multivariate directed

graphs An application to social networks In V Barnett (Ed) Interpreting multivariate datapp 289ndash306 Chichester Wiley

Fienberg S E Meyer M M amp Wasserman S (1985) Statistical analysis of multiple sociometricrelations Journal of the American Statistical Association 80 51ndash67

Frank O (1987) Multiple relation data analysis In H Iserman G Merle U Reider R Schmidt ampL Streitferdt (Eds) Operations research proceedings 1986 pp 455ndash460 BerlinHeidelbergSpringer-Verla g

Frank O (1991) Statistical analysis of change in networks Statistica Neerlandica 45 283ndash293Frank O (1997) Composition and structure of social networks Mathematiques Informatique et

Science Humaines 137 11ndash23Frank O Lundquist S Wellman B amp Wilson C (1986) Analysis of composition and structure of

social networks Unpublished manuscriptFrank O amp Nowicki K (1993) Exploratory statistical analysis of networks In J Gimbel J W

Kennedy amp L V Quintas (Eds) Quo Vadis Graph Theory Annals of Discrete Mathematics 55349ndash366

Frank O amp Strauss D (1986) Markov graphs Journal of the American Statistical Association 81832ndash842

Galaskiewicz J amp Marsden P V (1978) Interorganizationa l resource networks Formal patterns ofoverlap Social Science Research 7 89ndash107

Geyer C J amp Thompson E A (1992) Constrained Monte Carlo maximum likelihood for dependentdata Journal of the Royal Statistical Society Series B 54 657ndash699

Holland P W amp Leinhardt S (1973) The structural implications of measurement error in sociometryJournal of Mathematical Sociology 3 85ndash111

Holland P W amp Leinhardt S (1981) An exponential family of probability distributions for directedgraphs (with discussion) Journal of the American Statistical Association 76 33ndash65

Hubert L J amp Baker F B (1978) Evaluating the conformity of sociometric measurementsPsychometrika 43 31ndash41

Iacobucc i D (1989) Modeling multivaria te sequenti al dyadic interact ions Social Networks 11315ndash362

Iacobucci D amp Wasserman S (1987) Dyadic social interactions Psychological Bulletin 102 293ndash306

Ising E (1925) Beitrag zur Theorie des Ferromagnetism us Zeitscrhift fur Physik 31 253ndash258

Logit models and logistic regressions for social networks II 191

Johnsen, E. C. (1986). Structure and process: Agreement models for friendship formation. Social Networks, 8, 257–306.

Katz, L., & Powell, J. H. (1953). A proposed index of the conformity of one sociometric measurement to another. Psychometrika, 18, 249–256.

Kent, D. (1978). The rise of the Medici: Faction in Florence, 1426–1434. Oxford: Oxford University Press.

Lauritzen, S. (1996). Graphical models. Oxford: Oxford University Press.

Lazega, E., & Pattison, P. (1998). Social capital, multiplex generalized exchange and cooperation in organizations: A case study. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lee, N. H. (1969). The search for an abortionist. Chicago: University of Chicago Press.

Leifer, E. M. (1988). Interaction preludes to role setting: Exploratory local action. American Sociological Review, 53, 865–878.

Lomi, A., & Pattison, P. (1998). Multivariate p* models of producers' market structure. Paper presented at the 1998 International Network for Social Network Analysis Annual Meeting, Sitges, Spain, May.

Lorrain, F. P., & White, H. C. (1971). Structural equivalence of individuals in social networks. Journal of Mathematical Sociology, 1, 49–80.

Mandel, M. (1983). Local roles and social networks. American Sociological Review, 48, 376–386.

Mayer, A. C. (1977). The significance of quasi-groups in the study of complex societies. In S. Leinhardt (Ed.), Social networks: A developing paradigm, pp. 293–318. New York: Academic Press.

Merton, R. K. (1957). Social theory and social structure. New York: Free Press.

Michaelson, A. G. (1990). Network mechanisms underlying diffusion processes: Interaction and friendship in a scientific community. Doctoral thesis, School of Social Science, University of California, Irvine.

Nadel, S. F. (1957). The theory of social structure. Melbourne: Melbourne University Press.

Padgett, J. F., & Ansell, C. K. (1993). Robust action and the rise of the Medici, 1400–1434. American Journal of Sociology, 98, 1259–1319.

Parsons, T. (1966). The structure of social action. New York: Free Press.

Pattison, P. (1989). Mathematical models for local social networks. In J. A. Keats, R. Taft, R. A. Heath & S. H. Lovibond (Eds), Mathematical and theoretical systems, pp. 139–149. Amsterdam: North-Holland.

Pattison, P. (1993). Algebraic models for social networks. New York: Cambridge University Press.

Pattison, P., & Wasserman, S. (1995). Constructing algebraic models for local social networks using statistical methods. Journal of Mathematical Psychology, 39, 57–72.

Pattison, P., Wasserman, S., Robins, G., & Kanfer, A. M. (in press). Statistical evaluation of algebraic constraints for social relations. Journal of Mathematical Psychology.

Preisler, H. (1993). Modeling spatial patterns of trees attacked by bark-beetles. Applied Statistics, 42, 501–514.

Rennolls, K. (1995). p12. In M. G. Everett & K. Rennolls (Eds), Proceedings of the 1995 International Conference on Social Networks, 1, pp. 151–160. London: Greenwich University Press.

Robins, G. L. (1998). Personal attributes in inter-personal contexts: Statistical models for individual characteristics and social relationships. PhD dissertation, Department of Psychology, University of Melbourne.

Strauss, D. (1986). On a general class of models for interaction. SIAM Review, 28, 513–527.

Strauss, D. (1992). The many faces of logistic regression. American Statistician, 46, 321–327.

Strauss, D., & Ikeda, M. (1990). Pseudolikelihood estimation for social networks. Journal of the American Statistical Association, 85, 204–212.

Vickers, M. (1981). Relational analysis: An applied evaluation. MSc thesis, Department of Psychology, University of Melbourne.

Vickers, M., & Chan, S. (1981). Representing classroom social structure. Melbourne: Victoria Institute of Secondary Education.

Walker, M. E. (1995). Statistical models for social support networks: Application of exponential models to undirected graphs with dyadic dependencies. Doctoral dissertation, Department of Psychology, University of Illinois.

Wasserman, S. (1978). Models for binary directed graphs and their applications. Advances in Applied Probability, 10, 803–818.

Wasserman, S. (1987). Conformity of two sociometric relations. Psychometrika, 52, 3–18.

Wasserman, S., & Anderson, C. (1987). Stochastic a priori blockmodels: Construction and assessment. Social Networks, 9, 1–36.

Wasserman, S., & Faust, K. (1994). Social network analysis: Methods and applications. New York: Cambridge University Press.

Wasserman, S., & Pattison, P. (1996). Logit models and logistic regressions for social networks: I. An introduction to Markov graphs and p*. Psychometrika, 60, 401–425.

Wasserman, S., & Pattison, P. (in press). Multivariate random graph distributions. Lecture Notes in Statistics. New York: Springer-Verlag.

Wellman, B., Frank, O., Espinoza, V., Lundquist, S., & Wilson, C. (1991). Integrating individual, relational and structural analyses. Social Networks, 13, 223–249.

White, H. C. (1963). The anatomy of kinship: Mathematical models for structures of cumulated social roles. Englewood Cliffs, NJ: Prentice Hall.

White, H. C. (1977). Probabilities of homomorphic mappings from multiple graphs. Journal of Mathematical Psychology, 16, 121–134.

White, H. C., Boorman, S. A., & Breiger, R. L. (1976). Social structure from multiple networks: I. Blockmodels of roles and positions. American Journal of Sociology, 81, 730–780.

Whittaker, J. (1990). Graphical models in applied statistics. Chichester: Wiley.

Winship, C., & Mandel, M. (1983). Roles and positions: A critique and extension of the blockmodeling approach. In S. Leinhardt (Ed.), Sociological methodology 1983–1984, pp. 314–344. San Francisco: Jossey-Bass.

Received 29 May 1997; revised version received 2 February 1999

