Corporate Technology
Software Intelligence with
Understanding the Linux Ecosystem
Dr. Wolfgang Mauerer, M. Joblin, Dr. J. Ebke, Dr. M. Meilinger, Dr. A. Eckert, S. Weber, Prof. Dr. S. Apel, C. Rubner, D. Srivastav
Siemens AG, Corporate Research and Technologies
Corporate Competence Centre Embedded [email protected]
Page 1 Embedded Linux Conference San Jose – May 1st, 2014 W. Mauerer Siemens Corporate Technology
Overview
What’s Open Source Software about?
Publishing code
Building communities → Network effects!
So there are network effects… BUT:
Different types?
Good and bad approaches?
Impact of technology?
Better software?
Overview II
Why?
Learn about embedded software ecosystem
Find proper involvement
Project controlling
Learn from masters!
Two Problems of Software Development
Problem 1: It’s about Technology
Problem 2: It’s about People
Project: Software Intelligence
Pharmaceuticals
✓ A-priori understanding
✓ Tests & statistics
Software
✗ Comparative experiments
✗ Quantify people and behaviour
✗ Personal experience limited
Image source: fraxinusit.com
Introducing Codeface
Codeface: Two Aspects
(Fundamental) Research
$$P_{R_i}(k) = P_G(k) \quad \forall R_i \in \mathcal{R}$$
$$P_{\mathcal{R}}(Q) = \frac{|\{i \in [1, |\mathcal{R}|] \mid q_{R_i}(\mathcal{C}) = Q\}|}{|\mathcal{R}|}$$
$$H_0 : P_{\mathcal{R}}(Q = q_G(\mathcal{C})) > \epsilon$$
Practical Software
Project Goals
Learn from SW Devel Data
Objective properties of successful projects
Understand network effects
Quantify social factors
Mine large bodies of software (calibration: open source)
Find Actionable Insights
Find most efficient approaches for a given scenario
Assess ongoing development
Make informed choices
-ENOINTENTION
Don’t impose mandatory interpretations!
Data Sources
Software Intelligence
Revision Control
Collaboration
Growth, Dynamics
Code Complexity
Growth, Dynamics
Mailing Lists
Collaboration
Topics, Focus
Bug Tracking
Magnitudes
Dynamics
Examples
Example – Time Series
[Figure: Code changes for project 'Linux kernel'. Amount of changes over time (2010 to 2013) in three panels: averaged (small window), averaged (large window), cumulative.]
[Figure: Code changes for project 'Ruby on Rails'. Amount of changes over time (2006 to 2012) in three panels: averaged (small window), averaged (large window), cumulative.]
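The three panels of the time series plots (small-window average, large-window average, cumulative) can be approximated with a few lines of Python. This is only a minimal sketch: the weekly counts are invented, the window sizes are arbitrary, and `rolling_mean` is a hypothetical helper rather than anything taken from Codeface.

```python
from itertools import accumulate


def rolling_mean(values, window):
    """Trailing moving average; early entries average over the values seen so far."""
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            # Drop the value that just left the window.
            running -= values[i - window]
        out.append(running / min(i + 1, window))
    return out


# Hypothetical weekly counts of changed lines (made-up numbers, not project data).
changes = [120, 80, 400, 60, 90, 300, 50, 70]

small = rolling_mean(changes, window=2)   # follows bursts closely
large = rolling_mean(changes, window=4)   # smooths bursts out
cumulative = list(accumulate(changes))    # total amount of change so far
```

A small window keeps release-cycle bursts visible, while a large window shows the long-term trend, which is presumably why the plots show both next to the cumulative curve.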
Time Series II
Interpretation
No objectively “optimal” approach
Self-consistency matters!
Example – Communication: Paths & Topics
[Figure: mailing-list communication network (participants such as Chris Lattner, Chandler Carruth, Douglas Gregor, Ted Kremenek) with stemmed topic terms, e.g. clang, llvm, warn, error, fix, pars, ast, test, gcc, plugin, header.]
[Figure: mailing-list communication network (participants such as Junio C Hamano, Jeff King, Shawn Pearce, Jonathan Nieder) with stemmed topic terms, e.g. git, commit, branch, merg, push, rebas, remot, clone, test.]
Communication Paths & Topics
It’s not the content…
Topics: No surprises
Hard to evaluate by machines
But: Overall picture meaningful
Examples – Communication Paths & Topics
[Figure: responses vs. messages initiated per participant (both on log. scale) for clang and git; panels compare networks built from subjects and from content; point size and colour encode degree (high vs. low).]
Content does not matter: subjects suffice!
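The two axes of the scatter plots, messages initiated and responses, can be derived from threading metadata alone. The sketch below shows one plausible way to count them. The record layout is hypothetical, and crediting a response to the author of the answered message is an assumption, not a documented Codeface convention.

```python
from collections import Counter

# Hypothetical records: (message id, author, id of the message replied to, or None).
messages = [
    ("m1", "alice", None),
    ("m2", "bob", "m1"),
    ("m3", "carol", "m1"),
    ("m4", "bob", None),
    ("m5", "alice", "m4"),
]

author_of = {mid: author for mid, author, _ in messages}

# A message without a parent starts a thread and counts as "initiated".
initiated = Counter(author for _, author, parent in messages if parent is None)

# Assumption: a response is credited to the author of the message being answered.
responses = Counter(
    author_of[parent] for _, _, parent in messages if parent in author_of
)
```

Plotting `responses[a]` against `initiated[a]` on logarithmic axes for every author `a` would give a scatter plot of the same kind as the one on the slide.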
Communication Paths & Topics II
Examples – Cooperation and Teams
[Figure: developer collaboration network; vertices are labelled with numeric IDs.]
Cooperation I
QEMU: Current State and Early Days
Influence of individual contributors
Group structure
Cooperation II
How to…
Construct network?
Determine contributor centrality?
Identify (meaningful) subgroups?
Cooperation III
Network construction
Tagging (“Signed-off-by”)
Committer-Author
Overlapping contributions
Contributor Centrality
Google’s PageRank
Community Decomposition
Spin Glass approach
Vertices (developers) have spin state ∈ [0, c]
Edges (connections): Preferences to spin states
Approximate lowest energy state
Multi-Method support: OSLOM, Random Walks
Image source: cdn.oliveandcocoa.com/
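The first two steps, tag-based network construction and PageRank centrality, can be sketched in plain Python. Everything here is illustrative: the commit messages are invented, `tag_edges` and `pagerank` are hypothetical helpers, and pointing an edge from the commit author to each person who signed off is an assumed convention rather than the confirmed Codeface behaviour.

```python
import re

TAG = re.compile(r"^Signed-off-by:\s*(.+?)\s*<", re.MULTILINE)


def tag_edges(author, message):
    """Directed edges from the commit author to everyone else who signed off."""
    return [(author, name) for name in TAG.findall(message) if name != author]


def pagerank(edges, damping=0.85, iterations=50):
    """Plain power-iteration PageRank on a directed edge list."""
    nodes = sorted({n for edge in edges for n in edge})
    out = {n: [] for n in nodes}
    for src, dst in edges:
        out[src].append(dst)
    rank = {n: 1.0 / len(nodes) for n in nodes}
    for _ in range(iterations):
        # Rank held by vertices without outgoing edges is spread evenly.
        dangling = sum(rank[n] for n in nodes if not out[n])
        base = (1 - damping + damping * dangling) / len(nodes)
        new = {n: base for n in nodes}
        for src in nodes:
            for dst in out[src]:
                new[dst] += damping * rank[src] / len(out[src])
        rank = new
    return rank


# Hypothetical commits: (author, commit message).
commits = [
    ("alice", "Fix timer\n\nSigned-off-by: alice <a@example.org>\n"
              "Signed-off-by: carol <c@example.org>\n"),
    ("bob", "Add driver\n\nSigned-off-by: bob <b@example.org>\n"
            "Signed-off-by: carol <c@example.org>\n"),
    ("carol", "Cleanup\n\nSigned-off-by: carol <c@example.org>\n"
              "Signed-off-by: alice <a@example.org>\n"),
]

edges = [e for author, msg in commits for e in tag_edges(author, msg)]
ranks = pagerank(edges)
```

In this toy network the person who signs off on other people's patches collects the most rank, which matches the intuition that maintainers tend to be central in a tag-based network.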
It’s about People, part 2: Trust
Community Validation: Expert Knowledge
Avi Kivity
Gleb Natapov
Takuya Yoshikawa
Mathias Krause
Jinsong Liu
Raghavendra K T
Christoffer Dall
Christian Borntraeger
Michael S. Tsirkin
Marcelo Tosatti
Xiao Guangrong
Cornelia Huck
Jan Kiszka
Grant Grundler
Alexander Graf
Guo Chao
Petr Matousek
Stefan Fritsch
Christian Hildner
Eric B Munson
Dong Hao
Runzhen Wang
Chegu Vinod
Microsoft is taking over the Linux kernel!
Top Linux 3.0 Contributors (# Commits)
K. Y. Srinivasan   343   3.8%
David S. Miller    176   2.0%
Dan Williams       149   1.7%
Jonathan Cameron   119   1.3%
Takashi Iwai       108   1.2%
Mark Brown          91   1.0%
Surely people $\neq f(x_1, x_2, \ldots, x_n)$
Maybe persons $\approx f(x_1, x_2, \ldots, x_n \mid \text{large context})$
Communities vs. Numbers
Let’s go Quantitative
Nonsense
Black xor White
Optimal approach
Counting apples and peas per developer per minute times square foot
Sense
How does ⟨A⟩ compare to ⟨B⟩?
Self-consistency
Likelihood of result compared to random approach
Community Decomposition and Quality Estimation I
Quality Estimation
Meaningful community structures vs. random properties
Randomise clusters
Rewire edges
Keep properties (e.g., “amount” of participation)
$H_0$: Clustering stems from unorganised, random process.
Reject → Decomposition makes sense
Alternative Approach
Ken Schwaber says: A team has seven people (plus or minus two). Full stop!
So let’s study the Maths!
Conductance $\phi \in [0, 1]$ of a community $C$ that is a sub-graph of a larger collaboration graph $G$ is defined as
$$\phi_G(C) := \frac{|\operatorname{cut}(C, G \setminus C)|}{\min\{\deg(C), \deg(G \setminus C)\}}, \tag{1}$$
Identify a set of communities $\mathcal{C} = \{C_1, C_2, \ldots, C_k\}$, $C_i \subseteq G$. Conductance over all communities is given by $q_G(\mathcal{C}) = \sum_{C \in \mathcal{C}} \phi_G(C) / |\mathcal{C}|$.
Rewire: an edge pair $(u, v)$ and $(x, y)$ is rewired to $(u, y)$ and $(x, v)$, maintaining the number of edges but destroying the preferential attachment.
Repeat to generate a set $\mathcal{R} = \{R_1, R_2, \ldots, R_n\}$ of randomized graphs with $V(R_i) = V(G)\ \forall i$. The degree distribution $P_G(k) = |\{v \in G \mid \deg(v) = k\}| / |G|$ is maintained, so
$$P_{R_i}(k) = P_G(k) \quad \forall R_i \in \mathcal{R}. \tag{2}$$
Define
$$P_{\mathcal{R}}(Q) = \frac{|\{i \in [1, |\mathcal{R}|] \mid q_{R_i}(\mathcal{C}) = Q\}|}{|\mathcal{R}|}. \tag{3}$$
and test the hypothesis
$$H_0 : P_{\mathcal{R}}(Q = q_G(\mathcal{C})) > \epsilon, \tag{4}$$
against the alternative hypothesis given by $H_1 : P_{\mathcal{R}}(\cdot) \le \epsilon$.
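The procedure above (conductance of each community, degree-preserving rewiring, comparison with randomized graphs) can be tried on a toy graph. This is a sketch under stated assumptions: the graph and community assignment are invented, and all function names are hypothetical. Instead of the point probability in equation (3), it counts how often a randomized graph reaches a mean conductance at least as low as the observed one, a common one-sided substitute that is swapped in here for practicality.

```python
import random


def conductance(edges, community):
    """Cut edges divided by the smaller of the two degree sums."""
    inside = set(community)
    cut = sum((u in inside) != (v in inside) for u, v in edges)
    deg_in = sum((u in inside) + (v in inside) for u, v in edges)
    deg_out = 2 * len(edges) - deg_in
    return cut / min(deg_in, deg_out)


def mean_conductance(edges, communities):
    """Average conductance over a set of communities."""
    return sum(conductance(edges, c) for c in communities) / len(communities)


def rewire(edges, swaps, rng):
    """Degree-preserving rewiring: (u, v), (x, y) -> (u, y), (x, v)."""
    edges = list(edges)
    present = {frozenset(e) for e in edges}
    for _ in range(swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (u, v), (x, y) = edges[i], edges[j]
        a, b = frozenset((u, y)), frozenset((x, v))
        # Skip swaps that would create self-loops or duplicate edges.
        if len(a) < 2 or len(b) < 2 or a in present or b in present or a == b:
            continue
        present -= {frozenset((u, v)), frozenset((x, y))}
        present |= {a, b}
        edges[i], edges[j] = (u, y), (x, v)
    return edges


# Toy collaboration graph: two triangles joined by a single edge (invented data).
edges = [(0, 1), (1, 2), (0, 2), (3, 4), (4, 5), (3, 5), (2, 3)]
communities = [{0, 1, 2}, {3, 4, 5}]

rng = random.Random(1)
observed = mean_conductance(edges, communities)
randomized = [
    mean_conductance(rewire(edges, 50, rng), communities) for _ in range(200)
]
# Fraction of randomized graphs that look at least as well separated.
p = sum(q <= observed for q in randomized) / len(randomized)
```

A small `p` would suggest that the observed community structure is unlikely to stem from an unorganised, random process, so the null hypothesis would be rejected at the chosen threshold.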
Community Decomposition and Quality Estimation III
Cheat Sheet
Boxes: developer communities
Background color: strength of community (green = strong, yellow = weak)
Box border color: unique community identifier
Pie chart: fraction of developer participation in given community color
Node size: developer’s importance according to centrality measure
Link thickness: strength of relationship between entities
Link color: red is inter-community relationship, black is intra-community relationship
Input needed: Validate/Refute
Developer Input
Qemu, Linux kernel, Apache httpd, Joomla, jQuery, PHP, Busybox, gcc, firefox, Perl, openssl, bootstrap, OpenStack, QT
See you at the Codeface booth!
Image source: greensandgills.files.wordpress.com
Web Frontend
Technological Challenges
Tasks
Parse unstructured, real-world data
Big data handling
Multivariate, high-dimensional visualisation
Statistics, regression, machine learning
Graph mining, clustering
Large-scale automation
Dynamic, reactive web frontend
Technologies
Techniques
Computer Science, Mathematics, Physics, Linguistics, Statistics
Open Source, obviously!
Online Resources
Homepage: siemens.github.io/codeface
GitHub repo: github.com/siemens/codeface
Thanks for your interest!
Backup: Performance Measurements
[Figure: minutes (log. scale) vs. number of cores (1 to 16) for Bootstrap, git, Linux kernel and QEMU.]
[Figure: RAM [MiB] vs. number of cores (1 to 16) for Bootstrap, git, Linux kernel and QEMU.]
[Figure: minutes vs. number of cores (1 to 12) for Apache httpd.]
Backup: Cooperation II
Linux
QEMU
Samba
Backup: Maintenance Load
Linux
QEMU
OpenSSL
Bootstrap