Coauthorship Networks and Research Output

Chih-Sheng Hsieh [1], Michael D. König [2], Xiaodong Liu [3] and Christian Zimmermann [4]

Faculty Lunch Seminar, University of Zurich

10th April 2017

[1] Department of Economics, Chinese University of Hong Kong.
[2] Department of Economics, University of Zurich.
[3] Department of Economics, University of Colorado Boulder.
[4] Department of Economic Research, Federal Reserve Bank of St. Louis.


Research Projects

A. Innovation Networks and Technology Diffusion

A1. “R&D Networks: Theory, Empirics and Policy Implications”
A2. “Network Formation with Local Complements and Global Substitutes”
A3. “Dynamic R&D Networks with Process and Product Innovations”
A4. “Endogenous Technology Cycles in Dynamic R&D Networks”
A5. “Coauthorship Networks and Research Output”

B. Heterogeneous Firm Dynamics and Production Networks

B1. “Peer Effects in FDI Networks and Technology Spillovers”
B2. “Endogenous Production Networks”
B3. “From Imitation to Innovation: Where is All That Chinese R&D Going?”

C. Game Theoretic Models of Conflict in Networks

C1. “Networks and Conflicts: Theory and Evidence from Rebel Groups in Africa”
C2. “Conflict Networks in Syria and Iraq”
C3. “The Arab Spring and the Spread of Revolutions in Online Social Networks”


Contribution: Theory

◮ We build a micro-founded model for the output produced in scientific co-authorship networks.[5]

◮ Unlike previous work, we are able to characterize the interior equilibrium when multiple agents spend effort in multiple, possibly overlapping projects, and there are interaction effects in the cost of effort.

◮ The equilibrium solution then allows us to study the impact of individual researchers on total research output (ranking), and the optimal design of research funding programs.

[5] Coralio Ballester, Antoni Calvo-Armengol, and Yves Zenou. “Who’s Who in Networks. Wanted: The Key Player”. Econometrica 74.5 (2006), pp. 1403–1417; A. Cabrales, A. Calvo-Armengol, and Y. Zenou. “Social interactions and spillovers”. Games and Economic Behavior (2010); Matthew O. Jackson and Asher Wolinsky. “A Strategic Model of Social and Economic Networks”. Journal of Economic Theory 71.1 (1996), pp. 44–74.


Contribution: Empirics

◮ We develop a novel estimation framework in which agents can participate in many potentially overlapping projects, and this participation is endogenously modelled.

◮ The allocation of agents into different projects is determined by a matching process that depends on both the authors’ and the projects’ characteristics,[6] while the effort levels are determined in the Nash equilibrium.

◮ We estimate this model using data for the network of scientific coauthorships between economists registered in the Research Papers in Economics (RePEc) author service.[7]

[6] Arun Chandrasekhar. “Econometrics of network formation”. In: The Oxford Handbook of the Economics of Networks. 2015, pp. 303–357; A. Chandrasekhar and M. Jackson. “Tractable and Consistent Random Graph Models”. Available at SSRN 2150428 (2012).

[7] http://repec.org/


Contribution: Policy

◮ We develop a novel ranking measure (for economists and their departments) that quantifies the endogenous decline in research output due to the removal of an economist from the network (“key players”, “superstar” economists).[8]

◮ We find that the highest ranked authors are not necessarily the ones with the largest number of citations, nor do they necessarily coincide with the top authors under other ranking measures used in the literature.

◮ However, this discrepancy is not surprising, as traditional rankings are typically not derived from microeconomic foundations, and do not take into account the spillover effects generated in scientific knowledge production networks.

[8] Yves Zenou. “Key Players”. Oxford Handbook on the Economics of Networks, Y. Bramoulle, B. Rogers and A. Galeotti (Eds.), Oxford University Press (2015); Fabian Waldinger. “Peer effects in science: evidence from the dismissal of scientists in Nazi Germany”. The Review of Economic Studies 79.2 (2012), pp. 838–861; Pierre Azoulay, Joshua Graff Zivin, and Jialan Wang. “Superstar Extinction”. The Quarterly Journal of Economics 125.2 (2010), pp. 549–589.


◮ Our model further allows us to solve an optimal research funding problem of a planner who wants to maximize total scientific output by introducing research grants into the authors’ payoff function.[9]

◮ A comparison of our optimal funding policy with that of the National Science Foundation (NSF) indicates that there are significant differences, both at the individual and the departmental levels (e.g. NSF awards are uncorrelated with the number of coauthors).

◮ Our results highlight the importance of the coauthorship network in determining the optimal funding policy, while it does not seem to have an effect on the research funding program of the NSF.

[9] Paula E. Stephan. “The economics of science”. Journal of Economic Literature (1996), pp. 1199–1235; Paula E. Stephan. How economics shapes science. Harvard University Press, 2012.


Production Function

◮ Assume that there are research projects (papers) $s \in P = \{1, \ldots, p\}$ and researchers (authors) $i \in N = \{1, \ldots, n\}$.

◮ Let the production function for project s be given by

$$Y_s(G, e_s) = \sum_{i \in N_s} \alpha_i e_{is} + \frac{\beta}{2} \sum_{i \in N_s} \sum_{j \in N_s \setminus \{i\}} e_{is} e_{js}, \tag{1}$$

◮ where $e_{is} \geq 0$ is the research effort of agent i in project s, $N_s \subseteq N$ is the set of agents participating in project s, $e_s$ is the stacked vector of effort levels of the authors participating in project s,

◮ $\alpha_i \geq 0$ is the ability/skill of researcher i and

◮ $\beta$ is a spillover parameter from complementarities between the research efforts of coauthors.
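As a minimal sketch of Equation (1), the following evaluates the output of a single project from the abilities and efforts of its authors. The function name `project_output` and the numerical values are hypothetical and chosen only for illustration.

```python
def project_output(alpha, effort, beta):
    """Output Y_s of one project, following Equation (1).

    alpha, effort: dicts keyed by author id, restricted to the
    authors N_s who participate in the project.
    """
    authors = list(effort)
    # Linear part: sum_i alpha_i * e_is
    linear = sum(alpha[i] * effort[i] for i in authors)
    # Complementarity part: sum over ordered pairs (i, j) with j != i
    cross = sum(effort[i] * effort[j]
                for i in authors for j in authors if j != i)
    return linear + 0.5 * beta * cross


# Hypothetical two-author project
alpha = {1: 0.2, 2: 0.1}
effort = {1: 0.5, 2: 0.4}
y = project_output(alpha, effort, beta=0.3)
print(y)
```

Because the double sum runs over ordered pairs, each coauthor pair is counted twice, which the factor β/2 offsets.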


Payoffs

◮ The payoff of agent i is then given by

$$\pi_i(G, e) = \sum_{s=1}^{p} Y_s(G, e_s)\, \delta_{is} - \frac{1}{2} \sum_{s,s'=1}^{p} \phi_{s,s'}\, e_{is} e_{is'}\, \delta_{is} \delta_{is'}, \tag{2}$$

where $n_s = |N_s|$ is the number of agents participating in project s, and we have that $n_s = \sum_{i=1}^{n} \delta_{is}$ with $\delta_{is} \in \{0, 1\}$ indicating whether i is participating in project s.

◮ This cost is convex if and only if the $p \times p$ matrix $\phi$ with elements $\phi_{s,s'}$ is positive definite.

◮ The quadratic cost includes as special cases a convex total cost, when $\phi_{s,s'} = \gamma$, and a convex separable cost, when $\phi_{s,s'} = \gamma \delta_{s,s'}$ (with $\delta_{s,s'}$ the Kronecker delta).
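The two special cases of the cost matrix can be compared numerically. The sketch below assumes NumPy and uses hypothetical effort values; the helper `effort_cost` is illustrative and not part of the slides. It evaluates the cost term of Equation (2) for one agent and inspects the smallest eigenvalue of each matrix. Note that the total-cost matrix with all entries equal to γ has rank one, so it is only positive semidefinite.

```python
import numpy as np


def effort_cost(phi, e):
    """Quadratic effort cost (1/2) * e' phi e of one agent, as in Equation (2)."""
    return 0.5 * e @ phi @ e


gamma, p = 1.0, 3
phi_total = gamma * np.ones((p, p))   # phi_{s,s'} = gamma for all s, s'
phi_separable = gamma * np.eye(p)     # phi_{s,s'} = gamma * Kronecker delta

e = np.array([0.2, 0.3, 0.5])         # hypothetical efforts across 3 projects

cost_total = effort_cost(phi_total, e)          # (gamma/2) * (sum of efforts)^2
cost_separable = effort_cost(phi_separable, e)  # (gamma/2) * sum of squared efforts

# Smallest eigenvalues: approximately 0 for the total cost, gamma for the separable cost
min_eig_total = np.linalg.eigvalsh(phi_total).min()
min_eig_separable = np.linalg.eigvalsh(phi_separable).min()
print(cost_total, cost_separable, min_eig_total, min_eig_separable)
```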


Equilibrium Characterization

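◮ Proposition. Let the payoff function for each agent $i = 1, \ldots, n$ be given by Equation (2), assume that

$$\phi_{ss'} = \begin{cases} \gamma, & \text{if } s' = s, \\ \rho, & \text{otherwise,} \end{cases}$$

and let $\Gamma$ be the symmetric $(n \times p) \times (n \times p)$ matrix with elements

$$\Gamma_{is,jk} = \begin{cases} \rho\, \delta_{is} \delta_{jk}, & \text{if } i = j,\ s \neq k, \\ -\beta\, \delta_{is} \delta_{jk}, & \text{if } i \neq j,\ s = k, \\ 0, & \text{otherwise,} \end{cases} \tag{3}$$

with zero diagonal for $i, j = 1, \ldots, n$, and $s, k = 1, \ldots, p$. Further let $\delta$ be an $(n \times p)$ stacked vector with elements $\delta_{is}$, $\alpha$ an $(n \times p)$ stacked vector with elements $\alpha_{is} = \alpha_i \delta_{is}$ and $e$ an $(n \times p)$ stacked vector of agent-project effort levels, $e_{is}$. Then if the matrix $\gamma\, \mathrm{diag}(\delta) + \Gamma$ is invertible, the equilibrium effort levels are given by

$$e = (\gamma\, \mathrm{diag}(\delta) + \Gamma)^{-1} \alpha. \tag{4}$$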
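A minimal numerical sketch of Equation (4), assuming NumPy. The function name `equilibrium_efforts` is hypothetical. One implementation choice made here, which is an assumption and not stated on the slide: the linear system is built only over active author-project pairs (those with δ_is = 1), because the rows of γ diag(δ) + Γ that belong to inactive pairs are identically zero; efforts of inactive pairs are set to zero.

```python
import numpy as np


def equilibrium_efforts(delta, alpha, beta, gamma, rho):
    """Solve (gamma*diag(delta) + Gamma) e = alpha on the active pairs.

    delta: (n, p) 0/1 participation matrix; alpha: length-n abilities.
    Returns an (n, p) array of efforts (zero for inactive pairs).
    """
    delta = np.asarray(delta)
    n, p = delta.shape
    pairs = [(i, s) for i in range(n) for s in range(p) if delta[i, s]]
    m = len(pairs)
    M = np.zeros((m, m))
    for a, (i, s) in enumerate(pairs):
        for b, (j, k) in enumerate(pairs):
            if a == b:
                M[a, b] = gamma      # own marginal cost
            elif i == j:
                M[a, b] = rho        # same author, different project
            elif s == k:
                M[a, b] = -beta      # same project, different author
    rhs = np.array([alpha[i] for i, _ in pairs], dtype=float)
    sol = np.linalg.solve(M, rhs)
    e = np.zeros((n, p))
    for (i, s), value in zip(pairs, sol):
        e[i, s] = value
    return e


# Hypothetical case: two authors sharing a single project
efforts = equilibrium_efforts([[1], [1]], [0.2, 0.1],
                              beta=0.3, gamma=1.0, rho=0.05)
print(efforts)
```

In this two-author case the first-order conditions reduce to e_1 − βe_2 = α_1 and e_2 − βe_1 = α_2, which can be checked by hand.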


Line Graph

◮ Definition: Given a graph G, its line graph L(G) is a graph such that each node of L(G) represents an edge of G, and two nodes of L(G) are adjacent if and only if their corresponding edges share a common endpoint in G.[10]

◮ The matrix Γ represents a weighted matrix of the line graph L(G) of the bipartite collaboration network G, where

◮ each link between nodes sharing a project has weight −β, and

◮ each link between nodes sharing an author has weight ρ.

[10] Cf. e.g. Douglas B. West. Introduction to Graph Theory. 2nd ed. Prentice-Hall, 2001.
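The weighted line graph can be built directly from the list of (author, project) edges of the bipartite network. The sketch below uses only the standard library; the function name `weighted_line_graph`, the small three-author network and the parameter values are hypothetical.

```python
from itertools import combinations


def weighted_line_graph(edges, beta, rho):
    """Weighted line graph of a bipartite author-project network.

    edges: list of (author, project) pairs, one per edge of G.
    Returns a dict mapping a pair of edges of G to its link weight.
    """
    weights = {}
    for a, b in combinations(edges, 2):
        if a[1] == b[1]:
            weights[(a, b)] = -beta   # the two edges share a project
        elif a[0] == b[0]:
            weights[(a, b)] = rho     # the two edges share an author
    return weights


# Hypothetical network: author 1 works on projects 1 and 2,
# author 2 on project 1, author 3 on project 2.
edges = [(1, 1), (1, 2), (2, 1), (3, 2)]
links = weighted_line_graph(edges, beta=0.3, rho=0.05)
for pair, weight in links.items():
    print(pair, weight)
```

Two distinct edges of a bipartite graph can share at most one endpoint, so every link receives exactly one of the two weights.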


Example

[Figure: authors 1, 2, 3 and projects 1, 2, with effort links e11, e21, e12, e32.]

Figure: (Left panel) The bipartite collaboration network G of authors and projects, where round circles represent authors and squares represent projects. (Right panel) The projection of the bipartite network G on the set of coauthors.


[Figure: nodes e11, e12, e21, e32; two links with weight −β and one link with weight ρ.]

Figure: The line graph L(G) of Equation (3) associated with the collaboration network G, in which each node represents the effort an author invests into different projects. Solid lines indicate nodes sharing a project while dashed lines indicate nodes with the same author.


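◮ The payoffs of the agents are given by

$$\begin{aligned}
\pi_1 &= e_{11}\left(\alpha_1 + \tfrac{\beta}{2} e_{21}\right) + e_{21}\left(\alpha_2 + \tfrac{\beta}{2} e_{11}\right) + e_{12}\left(\alpha_1 + \tfrac{\beta}{2} e_{32}\right) + e_{32}\left(\alpha_3 + \tfrac{\beta}{2} e_{12}\right) - \tfrac{\gamma}{2} e_{11}^2 - \tfrac{\gamma}{2} e_{12}^2 - \rho\, e_{11} e_{12}, \\
\pi_2 &= e_{11}\left(\alpha_1 + \tfrac{\beta}{2} e_{21}\right) + e_{21}\left(\alpha_2 + \tfrac{\beta}{2} e_{11}\right) - \tfrac{\gamma}{2} e_{21}^2, \\
\pi_3 &= e_{12}\left(\alpha_1 + \tfrac{\beta}{2} e_{32}\right) + e_{32}\left(\alpha_3 + \tfrac{\beta}{2} e_{12}\right) - \tfrac{\gamma}{2} e_{32}^2.
\end{aligned}$$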


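◮ The matrix $\Gamma$ with elements $\Gamma_{is,jk}$ from Equation (3) is given by

$$\Gamma = \begin{pmatrix}
0 & \rho & -\beta & 0 & 0 & 0 \\
\rho & 0 & 0 & 0 & 0 & -\beta \\
-\beta & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & -\beta & 0 & 0 & 0 & 0
\end{pmatrix}.$$

◮ From Equation (4) we have that $(\gamma\, \mathrm{diag}(\delta) + \Gamma)\, e = \alpha$, that is

$$\left[ \begin{pmatrix}
\gamma & 0 & 0 & 0 & 0 & 0 \\
0 & \gamma & 0 & 0 & 0 & 0 \\
0 & 0 & \gamma & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & \gamma
\end{pmatrix} + \begin{pmatrix}
0 & \rho & -\beta & 0 & 0 & 0 \\
\rho & 0 & 0 & 0 & 0 & -\beta \\
-\beta & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & 0 & 0 & 0 & 0 & 0 \\
0 & -\beta & 0 & 0 & 0 & 0
\end{pmatrix} \right] \begin{pmatrix} e_{11} \\ e_{12} \\ e_{21} \\ e_{22} \\ e_{31} \\ e_{32} \end{pmatrix} = \begin{pmatrix} \alpha_1 \\ \alpha_1 \\ \alpha_2 \\ 0 \\ 0 \\ \alpha_3 \end{pmatrix}.$$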


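◮ Computing the inverse, $e = (\gamma\, \mathrm{diag}(\delta) + \Gamma)^{-1} \alpha$, then gives

$$\begin{aligned}
e_{11} &= \frac{(\beta - \gamma)(\beta + \gamma)(\alpha_2 \beta + \alpha_1 \gamma) + \gamma(\alpha_3 \beta + \alpha_1 \gamma)\rho}{\gamma^2 \rho^2 - (\beta^2 - \gamma^2)^2}, \\
e_{12} &= \frac{(\beta - \gamma)(\beta + \gamma)(\alpha_3 \beta + \alpha_1 \gamma) + \gamma(\alpha_2 \beta + \alpha_1 \gamma)\rho}{\gamma^2 \rho^2 - (\beta^2 - \gamma^2)^2}, \\
e_{21} &= \frac{(\beta - \gamma)(\beta + \gamma)(\alpha_1 \beta + \alpha_2 \gamma) + \beta(\alpha_3 \beta + \alpha_1 \gamma)\rho + \alpha_2 \gamma \rho^2}{\gamma^2 \rho^2 - (\beta^2 - \gamma^2)^2}, \\
e_{32} &= \frac{(\beta - \gamma)(\beta + \gamma)(\alpha_1 \beta + \alpha_3 \gamma) + \beta(\alpha_2 \beta + \alpha_1 \gamma)\rho + \alpha_3 \gamma \rho^2}{\gamma^2 \rho^2 - (\beta^2 - \gamma^2)^2}.
\end{aligned} \tag{5}$$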
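The closed-form expressions in Equation (5) can be cross-checked against a direct numerical solution of the linear system restricted to the four active author-project pairs. The sketch below assumes NumPy; the parameter values are illustrative choices, not estimates.

```python
import numpy as np

# Illustrative parameter values (assumptions for this check)
a1, a2, a3 = 0.2, 0.1, 0.9
beta, gamma, rho = 0.3, 1.0, 0.05

# System over the active pairs, ordered (e11, e12, e21, e32)
M = np.array([
    [gamma, rho,   -beta, 0.0],
    [rho,   gamma, 0.0,   -beta],
    [-beta, 0.0,   gamma, 0.0],
    [0.0,   -beta, 0.0,   gamma],
])
numeric = np.linalg.solve(M, np.array([a1, a1, a2, a3]))

# Closed form of Equation (5)
den = gamma**2 * rho**2 - (beta**2 - gamma**2) ** 2
d = (beta - gamma) * (beta + gamma)
closed = np.array([
    (d * (a2 * beta + a1 * gamma) + gamma * (a3 * beta + a1 * gamma) * rho) / den,
    (d * (a3 * beta + a1 * gamma) + gamma * (a2 * beta + a1 * gamma) * rho) / den,
    (d * (a1 * beta + a2 * gamma) + beta * (a3 * beta + a1 * gamma) * rho
     + a2 * gamma * rho**2) / den,
    (d * (a1 * beta + a3 * gamma) + beta * (a2 * beta + a1 * gamma) * rho
     + a3 * gamma * rho**2) / den,
])

print(numeric)
print(closed)
```

The two vectors should agree up to floating-point error, which gives a quick sanity check on both the matrix and the formulas.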


[Plot, two panels: efforts e11 and e12 (vertical axis, 0 to 1) against β (horizontal axis, 0 to 0.5).]

Figure: Equilibrium effort levels from Equation (5) for agent 1 with α1 = 0.2, α2 = 0.1, α3 = 0.9, ρ = 0.05 (left panel), ρ = 0.25 (right panel) and γ = 1. The dashed lines indicate the effort level for β = 0.


Superstars, Key Players and Rankings

◮ We analyze the impact of the removal of individual authors from the coauthorship network on overall scientific output.[11]

◮ The key author is defined by

$$i^* \equiv \arg\max_{i \in N} \left\{ \sum_{s=1}^{p} Y_s(G) - \sum_{s=1}^{p} Y_s(G^{-i}) \right\}. \tag{6}$$

◮ Further, aggregating researchers to their departments $D \subset N$ allows us to compute the key department as

$$D^* \equiv \arg\max_{D \subset N} \left\{ \sum_{s=1}^{p} Y_s(G) - \sum_{s=1}^{p} Y_s(G \setminus D) \right\}. \tag{7}$$

[11] Fabian Waldinger. “Quality matters: The expulsion of professors and the consequences for PhD student outcomes in Nazi Germany”. Journal of Political Economy 118.4 (2010), pp. 787–831.
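A minimal sketch of the key-author computation in Equation (6), assuming NumPy. Two modelling assumptions are made here for illustration and are not taken from the slides: removing author i sets row i of the participation matrix to zero while projects continue with the remaining authors, and the remaining authors re-optimize so that efforts follow Equation (4) on the reduced network. Function names and parameter values are hypothetical.

```python
import numpy as np


def total_output(delta, alpha, beta, gamma, rho):
    """Sum of project outputs (Equation (1)) at equilibrium efforts (Equation (4))."""
    delta = np.asarray(delta)
    n, p = delta.shape
    pairs = [(i, s) for i in range(n) for s in range(p) if delta[i, s]]
    if not pairs:
        return 0.0
    m = len(pairs)
    M = np.zeros((m, m))
    for a, (i, s) in enumerate(pairs):
        for b, (j, k) in enumerate(pairs):
            if a == b:
                M[a, b] = gamma
            elif i == j:
                M[a, b] = rho
            elif s == k:
                M[a, b] = -beta
    e = np.linalg.solve(M, np.array([alpha[i] for i, _ in pairs], dtype=float))
    total = 0.0
    for a, (i, s) in enumerate(pairs):
        total += alpha[i] * e[a]
        for b, (j, k) in enumerate(pairs):
            if a != b and s == k:          # coauthors on the same project
                total += 0.5 * beta * e[a] * e[b]
    return total


def key_author(delta, alpha, beta, gamma, rho):
    """Index of the author whose removal causes the largest output loss."""
    delta = np.asarray(delta)
    base = total_output(delta, alpha, beta, gamma, rho)
    losses = []
    for i in range(delta.shape[0]):
        reduced = delta.copy()
        reduced[i, :] = 0                  # assumption: author i leaves all projects
        losses.append(base - total_output(reduced, alpha, beta, gamma, rho))
    return int(np.argmax(losses)), losses


# Hypothetical network: author 0 on both projects, author 1 on project 0,
# author 2 on project 1
delta = np.array([[1, 1], [1, 0], [0, 1]])
key, losses = key_author(delta, [0.2, 0.1, 0.9], beta=0.3, gamma=1.0, rho=0.05)
print(key, losses)
```

The key department of Equation (7) follows the same pattern, with the rows of all members of D set to zero at once.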


Research Funding

◮ We consider a two-stage game:

◮ In the first stage, the planner announces the research funding scheme $z \in \mathbb{R}^n_+$ that the authors should receive, and

◮ in the second stage the authors choose their research efforts, given z.

◮ The optimal funding profile $z^*$ can then be found by backward induction.

◮ Consider the second stage. We assume that agent $i \in N$ receives research funding, $z_i \geq 0$, proportional to the output she generates:

$$\pi_i(G, e, z) = \sum_{s=1}^{p} Y_s(G, e_s)\, \delta_{is} - \frac{1}{2} \sum_{s,s'=1}^{p} \phi_{s,s'}\, e_{is} e_{is'}\, \delta_{is} \delta_{is'} + \underbrace{\sum_{s=1}^{p} Y_s(G, e_s)\, z_i\, \delta_{is}}_{\text{research funding}}$$


◮ In the special case that all researchers obtain the same funding per output produced, one can show that the planner has to solve the following problem (first stage)

$$z^* = \arg\max_{z \in \mathbb{R}_{\geq 0}} \Bigg( \underbrace{\sum_{s=1}^{p} Y_s(G, z)}_{\text{total output}} - \underbrace{\sum_{i=1}^{n} \sum_{s=1}^{p} \delta_{is}\, z\, Y_s(G, z)}_{\text{total cost}} \Bigg), \tag{8}$$

◮ where $Y_s(G, z)$ is the output of project s from Equation (1) with effort levels given by Equation (4).
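The planner's problem in Equation (8) can be explored with a simple grid search over a uniform funding rate z. The sketch below assumes NumPy. How funding enters the second-stage efforts is an assumption made here, based on the first-order conditions of the funded payoff: each author's output term is scaled by (1 + z), so the spillover weight becomes (1 + z)β and the ability vector becomes (1 + z)α. The slides do not spell out this step, and the function names, network and parameters are hypothetical.

```python
import numpy as np


def outputs_under_funding(delta, alpha, beta, gamma, rho, z):
    """Project outputs Y_s(G, z) under a uniform funding rate z (assumed FOCs)."""
    delta = np.asarray(delta)
    n, p = delta.shape
    pairs = [(i, s) for i in range(n) for s in range(p) if delta[i, s]]
    m = len(pairs)
    M = np.zeros((m, m))
    for a, (i, s) in enumerate(pairs):
        for b, (j, k) in enumerate(pairs):
            if a == b:
                M[a, b] = gamma
            elif i == j:
                M[a, b] = rho
            elif s == k:
                M[a, b] = -(1.0 + z) * beta   # funded marginal spillover
    rhs = (1.0 + z) * np.array([alpha[i] for i, _ in pairs], dtype=float)
    e = np.linalg.solve(M, rhs)
    Y = np.zeros(p)
    for a, (i, s) in enumerate(pairs):
        Y[s] += alpha[i] * e[a]
        for b, (j, k) in enumerate(pairs):
            if a != b and s == k:
                Y[s] += 0.5 * beta * e[a] * e[b]
    return Y


def planner_objective(delta, alpha, beta, gamma, rho, z):
    """Total output minus total cost of funds, as in Equation (8)."""
    Y = outputs_under_funding(delta, alpha, beta, gamma, rho, z)
    n_s = np.asarray(delta).sum(axis=0)       # number of funded authors per project
    return float(Y.sum() - z * (n_s * Y).sum())


# Hypothetical network and parameters
delta = np.array([[1, 1], [1, 0], [0, 1]])
alpha, beta, gamma, rho = [0.2, 0.1, 0.9], 0.3, 1.0, 0.05

grid = np.linspace(0.0, 0.5, 51)
values = [planner_objective(delta, alpha, beta, gamma, rho, z) for z in grid]
z_star = float(grid[int(np.argmax(values))])
print(z_star, max(values))
```

A grid search is used only for transparency; a one-dimensional numerical optimizer would serve equally well, and the grid's upper bound is an arbitrary choice that keeps the linear system well conditioned for these parameters.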


Data

◮ This study makes extensive use of the metadata assembled by the Research Papers in Economics (RePEc) author service initiative and its various projects.[12]

◮ RePEc assembles the information about publications relevant to economics from 1900 publishers, including all major commercial publishers and university presses, policy institutions and pre-prints from academic institutions.

◮ At the time of this writing, this encompasses 2.2 million records, including 0.75 million for pre-prints.

◮ We take the publication profiles of economists registered with the RePEc Author Service (49,000 authors), which include what they have published and where they are affiliated.[13]

[12] http://repec.org/
[13] https://authors.repec.org/


◮ We get information about their advisors, students and alma mater, as recorded in the RePEc Genealogy project.[14]

◮ We record in which mailing lists the papers have been disseminated through the NEP project.[15] These lists have human editors who determine to which field new working papers belong.

◮ We make use of paper download data that is made available by the LogEc project.[16]

◮ We use citations to the papers and articles as extracted by the CitEc project.[17]

◮ We use journal impact factors, author and institution rankings from IDEAS.[18]

◮ We make use of the “Ethnea” tool at the University of Illinois to establish the ethnicity of authors based on their first and last names.[19]

[14] https://genealogy.repec.org/
[15] https://nep.repec.org/
[16] http://logec.repec.org/
[17] http://citec.repec.org/
[18] https://ideas.repec.org/top/ . For a detailed description of the factors and rankings, see Zimmermann (2013).
[19] http://abel.lis.illinois.edu/cgi-bin/ethnea/search.py


Figure: The collaboration network among authors in the RePEc database, considering only coauthored projects and dropping projects with zero impact factor. A node’s size and shade indicate its degree.


Estimating the Production Function

◮ Suppose there are n authors and p papers. Following Equation (1), the production function of paper s, with $s = 1, \ldots, p$, is given by

$$Y_s = \sum_{i=1}^{n} \delta_{i,s}\, e_{i,s}\, \alpha_i + \frac{\beta}{2} \sum_{i=1}^{n} \sum_{j=1, j \neq i}^{n} \delta_{i,s} \delta_{j,s}\, e_{i,s} e_{j,s} + \epsilon_s, \tag{9}$$

◮ where $\delta_{i,s}$ is an indicator variable such that $\delta_{i,s} = 1$ if author i participates in paper s and $\delta_{i,s} = 0$ otherwise, $e_{i,s}$ is the effort that author i puts into paper s,

◮ $\alpha_i$ is a measure of the productivity of author i, which may be modelled by $\lambda x_i$, where $x_i$ denotes a $(k \times 1)$ vector of author characteristics, and

◮ $\epsilon_s$ is a project-specific random shock.

◮ The econometrician observes $Y_s$, $x_i$, and $\delta_{i,s}$, but not $e_{i,s}$ and $\epsilon_s$.


Matching Process

◮ A problem with directly estimating Equation (9) is the potential endogeneity of the participation matrix D, which collects the indicators $\delta_{i,s}$.

◮ To address this endogeneity problem, we model the endogenous matching process of author i to paper s by

$$\delta_{i,s} = \mathbf{1}\{\psi_{i,s} + u_{i,s} > 0\},$$

where $\psi_{i,s}$ denotes the matching quality between author i and paper s and $u_{i,s}$ is a random component.[20]

◮ We assume that

$$\psi_{i,s} = c'_{i,s} \gamma_1 + \gamma_2 \mu_i + \gamma_3 \kappa_s,$$

◮ where $c_{i,s}$ denotes an $h \times 1$ vector of dyad-specific regressors, which captures the homophily effect between author i and paper s.

◮ The variable $\mu_i$ represents author i’s unobserved characteristic, and $\kappa_s$ represents paper s’s unobserved characteristic.

[20] Arun Chandrasekhar. “Econometrics of network formation”. In: The Oxford Handbook of the Economics of Networks. 2015, pp. 303–357; A. Chandrasekhar and M. Jackson. “Tractable and Consistent Random Graph Models”. Available at SSRN 2150428 (2012).
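The matching rule can be illustrated by simulation. The sketch below uses only the standard library. It assumes, purely for illustration, that the random component u follows a standard logistic distribution, in which case the share of simulated matches should be close to 1 / (1 + exp(−ψ)). All regressors, coefficients and function names are hypothetical.

```python
import math
import random


def matching_index(c, gamma1, gamma2, mu_i, gamma3, kappa_s):
    """Matching quality psi_{i,s} = c' gamma1 + gamma2 * mu_i + gamma3 * kappa_s."""
    return sum(g * x for g, x in zip(gamma1, c)) + gamma2 * mu_i + gamma3 * kappa_s


def draw_match(psi, rng):
    """Draw delta_{i,s} = 1{psi + u > 0} with u standard logistic (assumed)."""
    v = min(max(rng.random(), 1e-12), 1.0 - 1e-12)   # keep the log finite
    u = math.log(v / (1.0 - v))                      # inverse-CDF logistic draw
    return 1 if psi + u > 0 else 0


rng = random.Random(0)

# Hypothetical dyad: same gender (1), different ethnicity (0), same field (1)
psi = matching_index(c=[1, 0, 1], gamma1=[0.5, 1.0, 0.5],
                     gamma2=1.0, mu_i=0.2, gamma3=0.3, kappa_s=0.0)

draws = [draw_match(psi, rng) for _ in range(20000)]
share = sum(draws) / len(draws)
prob = 1.0 / (1.0 + math.exp(-psi))
print(psi, share, prob)
```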


◮ From the RePEc data we know the research fields, which we use to capture the similarity between authors and projects.

◮ We further treat the author in the project who ranks the highest in RePEc as the main author of the project, and construct the variable $c_{i,s}$ between the main author of project s and author i according to their similarities in

◮ gender,

◮ ethnicity,

◮ research fields, and

◮ whether they have an advisor-advisee relationship.


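◮ Assuming $u_{i,s}$ is i.i.d. type-I extreme value distributed, we obtain a logistic regression model for the matching process:[21]

$$\mathrm{logit}(\delta_{i,s}) = c'_{i,s} \gamma_1 + \gamma_2 \mu_i + \gamma_3 \kappa_s. \tag{10}$$

◮ The production function of Equation (9) is

$$Y_s = \sum_{i=1}^{n} \delta_{i,s}\, e_{i,s}\, \alpha_i + \frac{\beta}{2} \sum_{i=1}^{n} \sum_{j=1, j \neq i}^{n} \delta_{i,s} \delta_{j,s}\, e_{i,s} e_{j,s} + \kappa_s \eta + v_s, \qquad v_s \sim N(0, \sigma_v^2), \tag{11}$$

where the author’s productivity $\alpha_i$ can be modelled by $\lambda x_i + \mu_i \rho$.

◮ The joint probability function of $Y = (Y_1, \cdots, Y_p)$ and D is

$$P(Y, D \mid x, c) = \int_{\mu} \int_{\kappa} P(Y \mid D, x, c, \mu, \kappa)\, P(D \mid c, \mu, \kappa)\, f(\mu) f(\kappa)\, d\mu\, d\kappa, \tag{12}$$

from which we can estimate the parameters $\theta = (\beta, \phi, \gamma, \lambda, \eta, \rho)$ with the author and paper latent variables, $\mu = (\mu_1, \cdots, \mu_n)$ and $\kappa = (\kappa_1, \cdots, \kappa_p)$.

[21] Bryan S. Graham. “Methods of Identification in Social Networks”. Annual Review of Economics 7.1 (2015), pp. 465–485.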
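The matching factor P(D | c, μ, κ) in Equation (12) can be sketched as a Bernoulli log-likelihood with the logit link of Equation (10). The code below uses only the standard library and assumes that, conditional on μ and κ, the indicators are independent across author-paper pairs. The function name and the toy inputs are hypothetical, and the integration over μ and κ is not shown.

```python
import math


def matching_log_likelihood(D, C, gamma1, gamma2, gamma3, mu, kappa):
    """log P(D | c, mu, kappa) under the logit link of Equation (10).

    D[i][s] in {0, 1}; C[i][s] is the list of dyad regressors c_{i,s}.
    Assumes independence across pairs conditional on mu and kappa.
    """
    ll = 0.0
    for i, row in enumerate(D):
        for s, d in enumerate(row):
            psi = (sum(g * x for g, x in zip(gamma1, C[i][s]))
                   + gamma2 * mu[i] + gamma3 * kappa[s])
            # log p = -log(1 + exp(-psi)); log(1 - p) = -log(1 + exp(psi))
            ll += -math.log1p(math.exp(-psi)) if d else -math.log1p(math.exp(psi))
    return ll


# Toy inputs: 2 authors, 2 papers, one dyad regressor
D = [[1, 0], [0, 1]]
C = [[[1.0], [0.0]], [[0.0], [1.0]]]
ll_null = matching_log_likelihood(D, C, gamma1=[0.0], gamma2=0.0, gamma3=0.0,
                                  mu=[0.0, 0.0], kappa=[0.0, 0.0])
ll_fit = matching_log_likelihood(D, C, gamma1=[2.0], gamma2=0.0, gamma3=0.0,
                                 mu=[0.0, 0.0], kappa=[0.0, 0.0])
print(ll_null, ll_fit)
```

With all coefficients at zero every match has probability one half, so the log-likelihood equals the number of pairs times −log 2; a positive coefficient on the regressor that aligns with the observed matches raises it.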


Table: Estimation results (matching). Standard errors in parentheses.

| Matching | Coauthored subsample, Model (1) | Coauthored subsample, Model (2) | Full sample, Model (1) | Full sample, Model (2) |
|---|---|---|---|---|
| Constant | – | -7.7896*** (0.1842) | – | -9.1930*** (0.1671) |
| Same NEP | – | 0.5412*** (0.0562) | – | 0.3494*** (0.1175) |
| Ethnicity | – | 1.1658*** (0.1441) | – | 3.4429*** (0.0857) |
| Affiliation | – | 7.2436*** (0.4373) | – | 7.0167*** (0.4493) |
| Gender | – | 0.0211 (0.1737) | – | 1.6455*** (0.1482) |
| Advisor-advisee | – | 7.5256*** (0.2462) | – | 7.2347*** (0.2126) |
| Author effect | – | 1.0104*** (0.0801) | – | 0.8884*** (0.1991) |
| Project effect | – | 0.0168 (0.0818) | – | -0.0046 (0.0504) |

Sample size: 773 (coauthored subsample), 1745 (full sample).

Note: Model (1) assumes an exogenous matching between authors and papers, while Model (2) allows for an endogenous matching according to Equation (10).


◮ Similarity in gender, ethnicity, and particularly affiliation between the lead researcher and coauthors makes collaborations more likely.[22]

◮ Similarities in the NEP fields positively and significantly affect collaborations.[23]

◮ Being in a Ph.D. advisor–advisee relationship also contributes substantially to collaborations.

◮ The author’s latent variable has a positive and significant effect on the author-project matching.

[22] Richard B. Freeman and Wei Huang. “Collaborating with people like me: Ethnic coauthorship within the United States”. Journal of Labor Economics 33.S1 (2015), S289–S318.

[23] Lorenzo Ductor. “Does Co-authorship Lead to Higher Academic Productivity?” Oxford Bulletin of Economics and Statistics (2014).


Table: Estimation results (output). Standard errors in parentheses.

| Output | Coauthored subsample, Model (1) | Coauthored subsample, Model (2) | Full sample, Model (1) | Full sample, Model (2) |
|---|---|---|---|---|
| β | 0.6446*** (0.0053) | 0.6371*** (0.0061) | 0.6442*** (0.0097) | 0.6323*** (0.0091) |
| φ | -0.0031 (0.0037) | 0.0032 (0.0052) | -0.0018 (0.0056) | -0.0107 (0.0060) |
| Constant | -1.4114*** (0.2996) | -0.9087*** (0.1848) | -1.3613*** (0.4480) | -1.3615*** (0.2596) |
| Log life-time citations | 0.3838*** (0.0541) | 0.2174*** (0.0403) | 0.3738*** (0.0789) | 0.3283*** (0.0565) |
| Years after graduation | -0.2334*** (0.0870) | -0.0881 (0.0768) | -0.2051 (0.1414) | -0.3804 (0.2650) |
| Gender | 0.3391** (0.1625) | 0.2325 (0.1977) | 0.3428 (0.2782) | 0.2401 (0.2673) |
| NBER connection | 0.1942 (0.1202) | 0.9356*** (0.1480) | 0.2162 (0.2070) | 0.2850 (0.2314) |
| Ivy League connection | 0.2782*** (0.1055) | 0.0912 (0.1245) | 0.2271 (0.1843) | 0.3755** (0.1725) |
| ρ | – | 0.8769*** (0.0575) | – | 1.3744*** (0.0528) |
| η | – | -1.0368 (0.8525) | – | 0.8087 (0.5770) |

Sample size: 773 (coauthored subsample), 1745 (full sample).

Note: Model (1) assumes an exogenous matching between authors and papers, while Model (2) allows for an endogenous matching according to Equation (10).


◮ The spillover effect of efforts between coauthors (β) and the congestion effect between projects (φ) have the expected signs, although only β is a significant predictor.

◮ Comparing Models (1) and (2), there exists an upward bias in the spillover effect β from assuming network exogeneity.

◮ Lifetime citations are a positive and significant predictor of research output, and so is the female dummy (exogenous network case).

◮ Being affiliated with the NBER positively and significantly impacts research output.

◮ Having attended an Ivy League university also positively affects output (exogenous network case).


Rankings for Individuals

Table: Ranking of the top twenty-five researchers.

| Rank | Name | Projects | Citations | RePEc Rank [a] | Output Loss [b] | Institute |
|---|---|---|---|---|---|---|
| 1 | Reinhart, Carmen M. | 5 | 17907 | 18 | -12.04% | Kennedy School of Government, Harvard University |
| 2 | Bloom, Nicholas | 11 | 3668 | 220 | -11.93% | Department of Economics, Stanford University |
| 3 | van Reenen, John Michael | 12 | 5835 | 79 | -11.90% | London School of Economics |
| 4 | Rogoff, Kenneth S | 5 | 20195 | 9 | -11.82% | Department of Economics, Harvard University |
| 5 | Falk, Armin | 4 | 4611 | 329 | -10.43% | Center for Economics and Neuroscience, University of Bonn |
| 6 | Dohmen, Thomas Johannes | 4 | 1951 | 1145 | -10.42% | Department of Economics, University of Bonn |
| 7 | Schmitt-Grohé, Stephanie | 6 | 3534 | 499 | -10.08% | Department of Economics, Columbia University |
| 8 | Uribe, Martín | 6 | 4401 | 363 | -10.07% | NBER, Columbia University |
| 9 | Angrist, Joshua D | 6 | 7982 | 55 | -9.85% | Economics Department, MIT |
| 10 | Black, Sandra E. | 5 | 2681 | 585 | -9.77% | Department of Economics, University of Texas-Austin |
| 11 | Demirguc-Kunt, Asli | 4 | 8820 | 107 | -9.73% | Economics Research, World Bank Group |
| 12 | Clark, Andrew | 5 | 4926 | 365 | -9.66% | Paris School of Economics |
| 13 | Coibion, Olivier | 6 | 680 | 1821 | -9.62% | Department of Economics, University of Texas-Austin |
| 14 | Devereux, Paul J. | 5 | 1575 | 1152 | -9.58% | School of Economics, University College Dublin |
| 15 | Huizinga, Harry P. | 3 | 2172 | 778 | -9.57% | School of Economics, Tilburg University |
| 16 | Reis, Ricardo | 2 | 2279 | 409 | -9.49% | London School of Economics |
| 17 | Mankiw, N. Gregory | 2 | 11931 | 42 | -9.49% | Department of Economics, Harvard University |
| 18 | Acemoglu, Daron | 4 | 16324 | 5 | -9.46% | Economics Department, MIT |
| 19 | Senik, Claudia | 6 | 803 | 2935 | -9.41% | Paris School of Economics |
| 20 | Gorodnichenko, Yuriy | 4 | 1740 | 518 | -9.36% | Department of Economics, University of California-Berkeley |
| 21 | Pischke, Jorn-Steffen | 5 | 2756 | 550 | -9.32% | London School of Economics |
| 22 | Wolfers, Justin | 4 | 2598 | 658 | -9.32% | Economics Department, University of Michigan |
| 23 | Kahn, Matthew E. | 4 | 2211 | 352 | -9.29% | Department of Economics, University of Southern California |
| 24 | Armstrong, Mark | 6 | 2191 | 591 | -9.28% | Department of Economics, Oxford University |
| 25 | Diebold, Francis X. | 4 | 10946 | 96 | -9.26% | Department of Economics, University of Pennsylvania |

[a] The RePEc ranking is based on an aggregate of rankings by different criteria. See Zimmermann (2013) for more information.
[b] The output loss for researcher i is computed as $\sum_{s=1}^{p} Y_s(G) - \sum_{s=1}^{p} Y_s(G^{-i})$. See also Equation (6).


Rankings for Departments

Table: Ranking of the top ten departments.

| Rank | Department/School | Size | RePEc Rank [a] | Output Loss [b] |
|---|---|---|---|---|
| 1 | Department of Economics, Harvard University | 10 | 1 | -15.43% |
| 2 | Kennedy School of Government, Harvard University | 6 | 16 | -12.84% |
| 3 | Department of Economics, Stanford University | 5 | 12 | -12.54% |
| 4 | Economics Department, Massachusetts Institute of Technology (MIT) | 7 | 5 | -11.69% |
| 5 | National Bureau of Economic Research (NBER) | 9 | 2 | -11.57% |
| 6 | Department of Economics, University of Texas-Austin | 4 | 134 | -11.52% |
| 7 | Department of Economics, Princeton University | 7 | 6 | -10.84% |
| 8 | Economics Research, World Bank Group | 4 | 12 | -10.83% |
| 9 | Department of Economics, University of Pennsylvania | 8 | 35 | -10.58% |
| 10 | Economics Department, University of Michigan | 5 | 32 | -10.35% |

[a] The RePEc ranking is based on an aggregate of rankings by different criteria. See Zimmermann (2013) for more information.
[b] The output loss for department D is computed as $\sum_{s=1}^{p} Y_s(G) - \sum_{s=1}^{p} Y_s(G \setminus D)$. See also Equation (7).


Research Funding

◮ We compare our optimal research funding scheme $z^*$ of Equation (8), computed using the parameter estimates, with funding programs implemented in the real world.[24]

◮ We use data on the funding amount, the receiving economics department and the principal investigators from the Economics Program of the National Science Foundation (NSF) in the U.S.[25]

◮ The National Bureau of Economic Research (NBER) received the largest amount of funds, totalling 95,058,724 U.S. dollars, followed by the University of Michigan with a total of 57,749,679 U.S. dollars.[26]

[24] Paula E. Stephan. How economics shapes science. Harvard University Press, 2012; Gianni De Fraja. “Optimal Public Funding for Research: A Theoretical Analysis”. RAND Journal of Economics 47.3 (2016), pp. 498–528.

[25] See https://www.nsf.gov/awardsearch/ .

[26] L. Drutman. “How the NSF allocates billions of federal dollars to top universities”. Sunlight Foundation blog (2012).


Research Funding for Individuals

Table: Ranking of the optimal research funding for the top-twenty five researchers.a

Name d Citations RePEc Rankb Institution NSF [0/00] Funding [%]c Rank

Dirk Bergemann  2  961  1039  Economics Department, Yale University  0.9060  6.4289  1
Stephen Morris  2  3233  269  Department of Economics, Princeton University  2.1517  4.5837  2
Nicholas Bloom  1  3668  220  Department of Economics, Stanford University  2.9818  4.1262  3
Eric Leeper  2  3096  429  Department of Economics, Indiana University  0.4063  3.4586  4
Joshua Angrist  2  7982  55  Economics Department, Massachusetts Institute of Technology (MIT)  2.5971  2.8433  5
Todd Walker  2  471  3164  Department of Economics, Indiana University  0.3002  2.6012  6
George-Marios Angeletos  3  1530  903  Economics Department, Massachusetts Institute of Technology (MIT)  0.5126  2.4224  7
Guido Lorenzoni  3  698  1644  Department of Economics, Northwestern University  0.4632  2.4224  8
Alessandro Pavan  3  703  1873  Department of Economics, Northwestern University  1.2122  2.4224  9
Daron Acemoglu  2  16324  5  Economics Department, Massachusetts Institute of Technology (MIT)  0.8070  2.4069  10
Sandra Black  1  2681  585  Department of Economics, University of Texas-Austin  0.5879  2.2748  11
Kenneth Rogoff  1  20195  9  Department of Economics, Harvard University  1.1896  1.8469  12
Thomas Holmes  1  1066  1083  Department of Economics, University of Minnesota  0.9731  1.7696  13
Francis Diebold  1  10946  96  Department of Economics, University of Pennsylvania  1.0423  1.5833  14
Donald Andrews  1  7343  43  Economics Department, Yale University  2.3043  1.5768  15
Glenn Ellison  2  3031  420  Economics Department, Massachusetts Institute of Technology (MIT)  0.9729  1.5453  16
Randall Wright  1  3830  200  School of Business, University of Wisconsin-Madison  1.4782  1.4518  17
Yuriy Gorodnichenko  1  1740  518  Department of Economics, University of California-Berkeley  0.8387  1.4079  18
Geert Bekaert  2  7239  178  Graduate School of Business, Columbia University  0.2738  1.3214  19
Robert Hodrick  2  3984  474  Graduate School of Business, Columbia University  0.2910  1.3214  20
Mark Rosenzweig  1  5257  99  Economic Growth Center, Economics Department, Yale University  0.8702  1.2468  21
Gita Gopinath  1  947  1930  Department of Economics, Harvard University  1.8603  1.2020  22
Harald Uhlig  1  3326  241  Department of Economics, University of Chicago  1.1066  1.1769  23
Mario Crucini  1  1185  1554  Department of Economics, Vanderbilt University  2.1582  1.1339  24
Mikhail Golosov  1  1016  1371  Department of Economics, Princeton University  0.7983  1.0749  25

a We only consider researchers that are listed as principal investigators in the Economics Program of the National Science Foundation (NSF) in the U.S. from 1976 to 2016 and that can be identified in the RePEc database.

b The RePEc ranking is based on an aggregate of rankings by different criteria. See Zimmermann (2013) for more information.

c The total cost of funds, $\sum_{s=1}^{p} \delta_{is} z Y_s(G, z)$, of researcher i with the optimal research funding scheme $z^{*}$ of Equation (8).


Research Funding for Departments

Table: Ranking of optimal research funding for the top-ten departments.a

Department/School  Size  NSF [‰]  Funding [%]b  Rank

Graduate School of Business, Columbia University  4  23.9521  10.2990  1
Sloan School of Management, MIT  4  14.3698  8.4177  2
Department of Economics, University of Texas-Austin  4  2.7044  6.4060  3
Department of Economics, University of Minnesota  7  42.2463  6.3126  4
Department of Economics, Vanderbilt University  5  4.8587  4.7536  5
Economics Department, Yale University  4  22.7299  4.2365  6
Economics Department, University of Rochester  4  5.3735  2.6427  7
Department of Economics, Northwestern University  7  37.5792  2.5090  8
Department of Economics, University of Chicago  9  20.5289  2.2148  9
Economics Department, Georgetown University  7  2.2544  2.1254  10

a We only consider researchers that are listed as principal investigators in the Economics Program of the National Science Foundation (NSF) in the U.S. from 1976 to 2016 and that can be identified in the RePEc database.

b The total cost of funds, $\sum_{i \in D} \sum_{s=1}^{p} \delta_{is} z^{*} Y_s(G, z)$, for each department D and researchers i ∈ D with the optimal research funding scheme $z^{*}$ of Equation (8).
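The aggregation in the footnote can be illustrated with a small sketch. It is not the authors' implementation: it assumes that δ_is is a 0/1 indicator of researcher i taking part in project s, that z is a single funding rate per unit of output, and that the project outputs Y_s are already known numbers. The names `researcher_cost`, `department_costs` and the toy data are hypothetical.

```python
from collections import defaultdict


def researcher_cost(delta_i, z, outputs):
    """Sum over projects s of delta_is * z * Y_s for one researcher.

    delta_i: list of 0/1 participation indicators (assumed meaning of delta_is)
    z:       funding rate per unit of output (assumed to be a scalar)
    outputs: list of project outputs Y_s (taken as given here)
    """
    return sum(d * z * y for d, y in zip(delta_i, outputs))


def department_costs(delta, z, outputs, dept_of):
    """Add up researcher-level costs within each department D."""
    totals = defaultdict(float)
    for researcher, delta_i in delta.items():
        totals[dept_of[researcher]] += researcher_cost(delta_i, z, outputs)
    return dict(totals)


# Toy data, purely illustrative: three projects, three researchers.
outputs = [2.0, 1.0, 4.0]
delta = {"A": [1, 0, 1], "B": [1, 1, 0], "C": [0, 0, 1]}
dept_of = {"A": "D1", "B": "D1", "C": "D2"}
z = 0.5

costs = department_costs(delta, z, outputs, dept_of)
print(costs)
```

Dividing each department's total by the sum over all departments would give a funding share comparable to the Funding [%] column, again under the assumptions stated above.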


Correlation Matrix

Figure: Pair correlation plot of the authors’ degrees, citations, total NSF awards and the optimal funding policy (panels: Deg., Citations, NSF, Funding). The Spearman correlation coefficients are shown for each scatter plot; the values reported are 0.13, 0.02, -0.01, 0.29, 0.45 and 0.22.
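As a reminder of what the Spearman coefficient measures, here is a minimal standard-library sketch: it ranks each variable (ties get the average rank) and then takes the Pearson correlation of the ranks. The function names and the toy numbers are hypothetical and are not taken from the RePEc or NSF data.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value is tied with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = avg
        pos = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equally long, non-constant sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy example: author degrees against a made-up funding score.
degree = [1, 2, 3, 4, 5]
funding = [2, 1, 4, 3, 5]
rho = spearman(degree, funding)
print(rho)
```

Because only ranks enter the calculation, the coefficient is unaffected by monotone transformations of either variable, which suits heavily skewed quantities such as citation counts.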


Conclusion

◮ We have analyzed the equilibrium efforts of authors involved in multiple, possibly overlapping projects.

◮ We bring our model to the data by analyzing the network of scientific coauthorships between economists registered in the RePEc author service.

◮ We rank the authors and their departments according to their contribution to aggregate research output, and thus provide the first ranking measure that is based on microeconomic foundations.

◮ Moreover, we analyze various funding instruments for individual researchers as well as their departments.

◮ We show that, because current research funding schemes do not take into account the availability of coauthorship network data, they are ill-designed to take advantage of the spillover effects generated in scientific knowledge production networks.


Future Work

◮ We can allow the returns of an author from participating in a project to be split equally among the participants.²⁷

◮ Instead of a convex cost, we can introduce a time constraint.²⁸

◮ In work in progress we are extending our analysis to

  ◮ the Framework Programs of the E.U. and

  ◮ the research funding program of the Swiss National Science Foundation.

²⁷ Eugene Kandel and Edward P. Lazear. “Peer pressure and partnerships”. Journal of Political Economy (1992), pp. 801–817; Matthew O. Jackson and Asher Wolinsky. “A Strategic Model of Social and Economic Networks”. Journal of Economic Theory 71.1 (1996), pp. 44–74.

²⁸ Leonie Baumann. “Time allocation in friendship networks”. Available at SSRN 2533533 (2014); Hannu Salonen. “Equilibria and Centrality in Link Formation Games”. Mimeo, Department of Economics, University of Turku, 20014 Turku, Finland (2014).
