22
3 Computational thinking and the pedagogy of network science Aaron Clauset @aaronclauset Computer Science Dept. & BioFrontiers Institute University of Colorado, Boulder External Faculty, Santa Fe Institute 20 June 2017 ©

Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

HVR 6

100150

200250

300

Computational thinking and the pedagogy of network science

Aaron Clauset@aaronclausetComputer Science Dept. & BioFrontiers InstituteUniversity of Colorado, BoulderExternal Faculty, Santa Fe Institute

20 June 2017 ©

Page 2: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

who are network scientists?

Physicists

Computer Scientists

Applied Mathematicians

Statisticians

Biologists

Ecologists

Sociologists

Political Scientists

it’s a big community!}

Page 3: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

who are network scientists?

Physicists

Computer Scientists

Applied Mathematicians

Statisticians

Biologists

Ecologists

Sociologists

Political Scientists

it’s a big community!

• different traditions

• different tools

• different questions

}

Page 4: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

who are network scientists?

Physicists

Computer Scientists

Applied Mathematicians

Statisticians

Biologists

Ecologists

Sociologists

Political Scientists

it’s a big community!

• different traditions

• different tools

• different questions

increasingly, not ONE community, but MANY, only loosely interacting communities

}

Page 5: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

who are network scientists?

Physicists

Computer Scientists

Applied Mathematicians

Statisticians

Biologists

Ecologists

Sociologists

Political Scientists

phase transitions, universality

data / algorithm oriented, predictions

dynamical systems, diff. eq.

inference, consistency, covariates

experiments, causality, molecules

observation, experiments, species

individuals, differences, causality

rationality, influence, conflict

}

Page 6: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

how do we teach networks?

what are core concepts?

what are core tools?

these vary by field!

Page 7: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

how do we teach networks?

what are core concepts?

what are core tools?

these vary by field!

for students:

• what topics should be covered?

• how should they be taught?

• how should evaluations be structured?

• what is introductory vs. what is advanced?

Page 8: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

http://santafe.edu/~aaronc/courses/5352/

Network Analysis and Modeling

Instructor: Aaron Clauset

This graduate-level course will examine modern techniques for analyzing and modeling the structure and dynamics of complex networks. The focus will be on statistical algorithms and methods, and both lectures and assignments will emphasize model interpretability and understanding the processes that generate real data. Applications will be drawn from computational biology and computational social science. No biological or social science training is required. (Note: this is not a scientific computing course, but there will be plenty of computing for science.)

Full lectures notes online (~150 pages in PDF)

how do we teach networks?

Page 9: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

how do we teach networks?

http://santafe.edu/~aaronc/courses/5352/

Network Analysis and Modeling

Instructor: Aaron Clauset

This graduate-level course will examine modern techniques for analyzing and modeling the structure and dynamics of complex networks. The focus will be on statistical algorithms and methods, and both lectures and assignments will emphasize model interpretability and understanding the processes that generate real data. Applications will be drawn from computational biology and computational social science. No biological or social science training is required. (Note: this is not a scientific computing course, but there will be plenty of computing for science.)

Full lectures notes online (~150 pages in PDF)

• graduate level Computer Science course

• 4 instances since 2013

• roughly 140 students (mostly CS, some APPM, PHYS, etc.)

• regular lectures

• 6 problem sets + 1 team project (of their choosing; team = 2-3 students)

• 150 pages of textbook style lecture notes

• textbooks:

Page 10: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

how do we teach networks?

Network Analysis and Modeling

learning goals:

1. develop a network intuition for reasoning about how structural patterns are related, and how they influence dynamics in / on networks

2. master basic terminology and concepts

3. master practical tools for analyzing / modeling structure of network data

4. build familiarity with advanced techniques for exploring / testing hypotheses about networks

Page 11: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

how do we teach networks?Network Analysis and Modeling

current format (lectures):

1. network basics2. centrality measures3. random graphs (simple)4. configuration model5. large-scale structure (communities, hierarchies, etc.)6. probabilistic generative models (SBMs)7. metadata, label and link prediction8. spreading processes (social, biological, SI-type)9. data wrangling + data sampling (artifacts)10. role of statistics in hypothesis generation / testing11. spatial networks12. citations networks, dynamics, preferential attachment13. temporal networks14. student project presentations

building intuitionbasic concepts, toolspractical toolsadvanced tools

Page 12: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

network data for assignments

Page 13: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

lessons learned

what’s difficult:

1. students need to know many different things:

2. can’t teach all of these things to all types of students!

• vast amounts of advanced material in each of these directions

• students have little experience / intuition of what makes good science

• some probability Erdos-Renyi, configuration, calculations• some mathematics physics-style calculations, phase transitions• some statistics basic data analysis, correlations, distributions• some machine learning prediction, likelihoods, features, estimation algorithms• some programming data wrangling, coding up measures and algorithms

Page 14: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

lessons learned

what works well:

1. simple mathematical problemsbuild intuition + practice with concepts

nA nB

A

B

calculate the diameter

closeness centrality

modularity of a line graph

n− rr

betweenness of

Q(r)

A

A

Page 15: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

lessons learned

what works well:

2. analyze real networkstest understanding + practice with implementing methods

102

103

104

105

2

2.5

3

3.5

Network size, n

Mean g

eodesi

c path

length

USF

Haverford

Caltech

Penn

mean geodesics and O(log n)1 4 7 10 13 16 19 22 25 28 31 34

0.35

0.4

0.45

0.5

0.55

0.6

0.65

0.7

0.75

vertex label

harm

onic

centr

alit

y

Karate clubconfiguration modelreal-world network

node centrality vs. configuration model(when is a pattern interesting?) Assortativity (gender)

-0.1 -0.05 0 0.05 0.1

Den

sity

×10-3

0

0.2

0.4

0.6

0.8

1

1.2

1.4

1.6

1.8

2

attribute assortativity

Page 16: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

lessons learned

what works well:

3. simple prediction taskstest intuition + run numerical experiments

Fraction of labels observed, f0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

Frac

tion

of c

orre

ct la

bel p

redi

ctio

ns

0

0.1

0.2

0.3

0.4

0.5

0.6

0.7

0.8

0.9

1malaria genes, HVR5Norwegian boards, net1m-2011-08-01

Fraction of edges observed, f0 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1

AUC

0.5

0.6

0.7

0.8

0.9

1HVR5 malaria genes network

degree productJaccard coefficientshortest pathbaseline (guessing)

label prediction via homophily link prediction via heuristic

Page 17: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

in-degree, kin

100 101 102 103 104 105

Pr(K

≥ k

in)

10-6

10-5

10-4

10-3

10-2

10-1

100

r=1r=4no preferential attachment

015

5

1

l

10

10

cin-cout p

15

0.550 0

lessons learned

what works well:

4. simple simulationsexplore dynamics vs. structure + numerical experiments

simulate epidemics (SIR) on planted partitions simulate Price’s model

Page 18: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

lessons learned

what works well:

5. team projectsteamwork + exploring their own ideas

Page 19: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

key takeaways

0

0.51

a(t)

0

1

0

1

0200

400600

0

1

alignment position t

1

23 4

56

78

9

calculate alignment scoresconvert to alignment indicatorsremove short aligned regionsextract highly variable regions

NGDYKEKVSNNLRAIFNKIYENLNDPKLKKHYQKDAPNY

NGDYKKKVSNNLKTIFKKIYDALKDTVKETYKDDPNY

NGDYKEKVSNNLRAIFKKIYDALEDTVKETYKDDPNY

16

6

13

16 6

13

A

B

C

D

Page 20: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

• network intuition is hard to develop!

good intuition draws on many skills (probability, statistics, computation, causal dynamics, etc.)

• best results for students seem to come from1. exercises to get practice with calculations2. practice analyzing diverse real-world networks3. conducting out numerical experiments & simulations

• students loved practical tasks (link and label prediction very popular)

• centrality measures seem useless… (why do we teach them?)

• configuration model as go-to null model for checking if pattern is "interesting"

• idea: teach concepts using link prediction as common theme?

key takeaways

0

0.51

a(t)

0

1

0

1

0200

400600

0

1

alignment position t

1

23 4

56

78

9

calculate alignment scoresconvert to alignment indicatorsremove short aligned regionsextract highly variable regions

NGDYKEKVSNNLRAIFNKIYENLNDPKLKKHYQKDAPNY

NGDYKKKVSNNLKTIFKKIYDALKDTVKETYKDDPNY

NGDYKEKVSNNLRAIFKKIYDALEDTVKETYKDDPNY

16

6

13

16 6

13

A

B

C

D

Page 21: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

thanks

http://santafe.edu/~aaronc/courses/5352/

Network Analysis and Modeling

Instructor: Aaron Clauset

This graduate-level course will examine modern techniques for analyzing and modeling the structure and dynamics of complex networks. The focus will be on statistical algorithms and methods, and both lectures and assignments will emphasize model interpretability and understanding the processes that generate real data. Applications will be drawn from computational biology and computational social science. No biological or social science training is required. (Note: this is not a scientific computing course, but there will be plenty of computing for science.)

Full lectures notes online (~150 pages in PDF)

Page 22: Computational thinking and the pedagogy of network sciencetuvalu.santafe.edu/~aaronc/slides/Clauset_2017... · 2017-06-20 · Computational thinking and the pedagogy of network science

fin

0

0.51

a(t)

0

1

0

1

0200

400600

0

1

alignment position t

1

23 4

56

78

9

calculate alignment scoresconvert to alignment indicatorsremove short aligned regionsextract highly variable regions

NGDYKEKVSNNLRAIFNKIYENLNDPKLKKHYQKDAPNY

NGDYKKKVSNNLKTIFKKIYDALKDTVKETYKDDPNY

NGDYKEKVSNNLRAIFKKIYDALEDTVKETYKDDPNY

16

6

13

16 6

13

A

B

C

D

network science