Junji Shimagaki
Yasutaka Kamei
Ahmed E. Hassan
Naoyasu Ubayashi
Shane McIntosh
A Study of the Quality-Impacting Practices of Modern Code Review at Sony Mobile
Code review is an important software quality assurance practice
Programmer Code reviewer
Sony Mobile uses Gerrit Code Review tools
1. Commit message
2. Files under review
3. Reviewers
4. Review scores
Code Review context at Sony Mobile
“Code-Review” (e.g., syntax, grammar, logic)
Test results on
Application crashes?
Reboot after 10 sec?
Lax reviewing practices impact software quality
Reviewer Programmer
Poorly reviewed code
Code repository
McIntosh et al., EMSE 2015
How about at Sony Mobile?
Lax reviewing practices impact software quality
Approach
Quantitative study: Replication of McIntosh et al., EMSE 2015
Qualitative study: Developer surveys at Sony Mobile with 100+ people
Implications: Better code review practices validated by stakeholders
Quick results: Simple adaptation does not work!
Review coverage
✗ Un-review ratio
Review participation
✗ Self approval
✗ Discussion volume
A software project for this.
Target system: A smartphone product with a 6-month release cycle
Sony's unique apps, HW
Chipset and modem
Android OS
Strong dependencies on third-party systems
Why might Sony Mobile be different?
Third-party dependencies: Previously studied systems are less impacted by third-party dependencies.
Offline communication: Previously studied systems rely on online communication methods.
Embedded software development: Previously studied systems are applications at higher levels of the application stack.
Do reviewing practices impact software quality?
Review coverage
✗ Un-review ratio
✓ Third-party ratio
Components with a higher third-party codebase ratio are more defect-prone.
Do reviewing practices impact software quality?
Review participation
✗ Discussion volume
✓ Patch update activity
Components with high patch update activity are less defect-prone.
Do reviewing practices impact software quality?
Review participation
✗ Self approval
✓ Self verify
Components where self-verification is prevalent are more defect-prone.
Do reviewing practices impact software quality?
Review coverage
✓ Third-party ratio
Review participation
✓ Self verify
✓ Patch update activity
Why are these metrics effective? Let's ask the developers!
Qualitative Study Approach
Presentation
Initial survey (93 stakeholders)
Interviewee's list
Semi-structured interviews (15 key engineers)
Implications
Validation survey (25 senior stakeholders)
Confirmed implications ✓
Why does third-party ratio matter at Sony Mobile?
“An external codebase takes more time from me to understand the code and to develop patches.” (Software engineer)
Developers require more time and effort to understand, extend, or repair components with high third-party rates.
✓ 92% of stakeholders agreed
Why does self-verify rate matter at Sony Mobile?
“I understand the architecture, and I am the one who can test my commit properly.” (Software engineer)
The self-verification practice is coloured by the author’s subjective perspective, which may bias the testing procedures and results.
✓ 75% of stakeholders agreed
Why does patch update rate matter at Sony Mobile?
“… it is much easier to work with direct communication rather than with the Gerrit tools.” (Software architect)
“Patch updates rate” captures developer effort in a way that is not diminished by in-person discussion at Sony Mobile.
✓ 81% of stakeholders agreed
What is Sony Mobile doing to adjust their reviewing process?
QA has a new focus on test coverage of external code
Discouraging the practice of self-verification
Investigating ways to encourage passive developers to participate more in code review
Backup slides
Review coverage of a component
Review is performed at Sony Mobile's Gerrit
Code review participation metrics
Self review only? Number of commits approved by their own author
Enough code review effort? Discussion volume recorded in Gerrit
But, again, they do NOT share a relationship with defect-proneness
✗ Number of commits approved by their own author
✗ Discussion volume recorded in Gerrit
Adjusted review participation metrics share a relationship with defect-proneness
✓ Number of commits verified by their own author
✓ Patch updates activity
Lax reviewing practices are associated with defect-proneness.
In-House shares a stronger relationship with defect-proneness at Sony Mobile
Review coverage: no significant link with defect-proneness.
In-House: defect-proneness declines as the In-House ratio increases. External components tend to be more defect-prone.
Self-verify shares an increasing relationship with defect-proneness.
[Plot: defect-proneness (logit-transformed) against number of self_verify commits]
Patch updates activity shares a decreasing relationship with defect-proneness.
[Plot: defect-proneness (logit-transformed) against patch updates activity]
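The plots above report defect-proneness on a logit scale. As a reminder, and assuming defect-proneness is expressed as a probability p strictly between 0 and 1 (the slides do not state how it is estimated), the standard logit transform is:

```latex
\operatorname{logit}(p) = \ln\!\left(\frac{p}{1 - p}\right), \qquad 0 < p < 1
```

On this scale a value of 0 corresponds to p = 0.5, negative values to p < 0.5, and positive values to p > 0.5.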
Do reviewing practices impact software quality at Sony Mobile?
Quantitative study: Replication of McIntosh et al., EMSE 2015
Qualitative study: Developer surveys at Sony Mobile with 100+ people
Implications: Better code review practices validated by stakeholders
A software project for this.
Target system: A smartphone product with a 6-month release cycle
700 components ...
300 components ...
We study defect-proneness of those 1,000 components
Do reviewing practices impact software quality?
RQ1: Review coverage
RQ2: Review participation
Review coverage of a component
Code repository of 1 component
1 commit
Review is performed at Sony Mobile's Gerrit
8 commits, 4 reviewed: 50%
5 commits, 1 reviewed: 20%
2 commits, 2 reviewed: 100%
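The three percentages on this slide follow from dividing reviewed commits by total commits per component. A minimal sketch of that arithmetic, where the `Component` record and the component names are hypothetical placeholders and only the commit counts come from the slide:

```python
from dataclasses import dataclass


@dataclass
class Component:
    """Hypothetical record holding commit counts for one component."""
    name: str
    total_commits: int
    reviewed_commits: int


def review_coverage(component: Component) -> float:
    """Proportion of a component's commits that went through code review."""
    if component.total_commits == 0:
        raise ValueError(f"{component.name} has no commits")
    return component.reviewed_commits / component.total_commits


# Commit counts are taken from the slide; the names are placeholders.
components = [
    Component("component_a", total_commits=8, reviewed_commits=4),
    Component("component_b", total_commits=5, reviewed_commits=1),
    Component("component_c", total_commits=2, reviewed_commits=2),
]

coverage = {c.name: review_coverage(c) for c in components}
for name, value in coverage.items():
    print(f"{name}: {value:.0%}")
```

Running this prints 50%, 20%, and 100% for the three components.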
However, it does NOT share a relationship with defect-proneness
✗ 50% ✗ 20% ✗ 100%
At Sony Mobile, review status is equivalent to whether a commit is made 'In-House'
Reviewed (✓): Sony Mobile's internal patches
Not reviewed: Linux kernel's baseline commits
But, our defined bags look too small to represent the 'In-House' made ratio
Commits during development of
Slipped historic kernel commits
We adjusted the definition of 'review coverage'
→ ??%
Proportion of 'In-House' commits in total
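The adjusted definition can be sketched as a ratio over a component's whole commit history. This is an illustration only: the `"in_house"` / `"external"` origin tags and the example history are assumptions, not data from the study.

```python
from typing import Iterable


def in_house_ratio(commit_origins: Iterable[str]) -> float:
    """Proportion of 'In-House' commits among all commits of a component.

    Each element is an origin tag, either "in_house" or "external"
    (hypothetical tags chosen for this sketch).
    """
    origins = list(commit_origins)
    if not origins:
        raise ValueError("component has no commits")
    in_house = sum(1 for origin in origins if origin == "in_house")
    return in_house / len(origins)


# Hypothetical history: 3 in-house patches on top of 297 external commits.
history = ["external"] * 297 + ["in_house"] * 3
ratio = in_house_ratio(history)
print(f"In-House ratio: {ratio:.1%}")
```

Counting the external baseline commits in the denominator is what separates this ratio from the original review coverage, which only looked at the commits in the studied window.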
Adjusted review coverage shares a relationship with defect-proneness
✓ 0.1% ✓ 1% ✓ 100%
Externally originated components tend to be more defect-prone.
Do reviewing practices impact software quality?
RQ1: Review coverage
✓ YES! In-House ratio
RQ2: Review participation
???
Ok, code-reviewed but...
✓ A reviewed commit is no guarantee of active participation
Definitions of participation
Who approved this commit?
Sufficient discussion (effort) made?
Code review participation metrics
Number of commits approved by their own author
Discussion volume recorded in Gerrit
But, they do NOT share a relationship with defect-proneness
✗ Number of commits approved by their own author
✗ Discussion volume recorded in Gerrit
We adjust self-approval
We only counted the number of self “Code-Review” votes
We also count the number of self “Verified” votes
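One way to picture the adjustment is to count, per label, the changes whose author cast a positive vote on their own change. The sketch below uses hypothetical `Vote` and `Change` records rather than Gerrit's actual data model, and it assumes any positive vote counts as approval or verification.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Vote:
    label: str   # "Code-Review" or "Verified"
    voter: str
    value: int   # assumption: a positive value means approval


@dataclass
class Change:
    author: str
    votes: List[Vote]


def has_self_vote(change: Change, label: str) -> bool:
    """True if the author cast a positive vote with this label on their own change."""
    return any(
        v.label == label and v.voter == change.author and v.value > 0
        for v in change.votes
    )


def count_self_votes(changes: List[Change]) -> Tuple[int, int]:
    """Return (number of self-approved changes, number of self-verified changes)."""
    self_approved = sum(has_self_vote(c, "Code-Review") for c in changes)
    self_verified = sum(has_self_vote(c, "Verified") for c in changes)
    return self_approved, self_verified


# Hypothetical changes for illustration.
changes = [
    Change("alice", [Vote("Code-Review", "bob", 2), Vote("Verified", "alice", 1)]),
    Change("bob", [Vote("Code-Review", "bob", 2), Vote("Verified", "bob", 1)]),
    Change("carol", [Vote("Code-Review", "alice", 2), Vote("Verified", "dave", 1)]),
]

self_approved, self_verified = count_self_votes(changes)
print(self_approved, self_verified)
```

In this example the first change is reviewed by someone else but verified by its own author, so it is missed by the self “Code-Review” count and only shows up in the self “Verified” count.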
Code review process at Sony Mobile
Programmer, Code reviewer, Gerrit server
1. Code.
2. Upload.
3. Review.
4. Verify on HW.
5. Privileged.
6. Submit.
Code review system at Sony Mobile
We adjust effort
“… it is much easier to work with direct communication rather than with the Gerrit tools.”
Software architect
No longer assume discussion is online. We introduce a new “patch update ratio”.
✓ Who verified this commit? Number of commits verified by their own author
✓ Sufficient discussion (effort) made? Patch updates activity
Adjusted review participation metrics share a relationship with defect-proneness
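The slides do not spell out how the “patch update ratio” is computed. As a rough sketch, assume a change counts as "updated" when it has more than one patch set, and take the ratio of updated changes to all changes in a component. Both the function and that definition are assumptions made for illustration.

```python
from typing import Iterable


def patch_update_ratio(patch_set_counts: Iterable[int]) -> float:
    """Share of changes that were revised at least once after upload.

    Each element is the number of patch sets of one change. A change with
    more than one patch set is treated as updated (assumed definition).
    """
    counts = list(patch_set_counts)
    if not counts:
        raise ValueError("component has no changes")
    if any(n < 1 for n in counts):
        raise ValueError("every change has at least one patch set")
    updated = sum(1 for n in counts if n > 1)
    return updated / len(counts)


# Hypothetical component: four changes with 1, 3, 2, and 1 patch sets.
ratio = patch_update_ratio([1, 3, 2, 1])
print(f"patch update ratio: {ratio:.0%}")
```

A measure like this records effort even when the discussion that prompted a revision happened in person, since the revised patch set still appears in the review tool.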