12. Conclusion
Overview • 1.Introduction • 2.Project Management • 3.Requirements • 4.Domain Modeling • 5.Testing • 6.Design by Contract • 7.Formal Specification • 8.Software Architecture • 9.Quality Control • 10.Software Metrics • 11.Refactoring
Articles • The Quest for the Silver Bullet • The Case of the Killer Robot
Professional Ethics • Cases
The future: Software Engineering Tools
CHAPTER 12 – Conclusion
Software Product & Process
•Software Process: Requirements Collection + Analysis + Design + Implementation+ Testing + Maintenance + Quality Assurance
•Software Product: Requirements Specification (= functional & non-functional) + System (= executable + code + documentation)
Requirement Specification ⇒ System
Requirement Collection → Analysis → Design → Implementation → Testing → Maintenance (each + Quality Assurance)
Evaluation Criteria
Requirement Specification ⇔ System
Two evaluation criteria to assess the techniques applied during the process
Correctness •Are we building the right product? •Are we building the product right?
Traceability •Can we deduce which product components will be affected by changes?
For large scale projects, a feedback loop is necessary ... ⇒ Mix of Iterative and Incremental Development
Overview
Requirements Collection ⇔ Use Cases & User Stories
Analysis ⇔ Domain Modeling
Design ⇔ Software Architecture & Formal Specifications
Implementation ⇔ Design by Contract
Testing ⇔ Testing
Maintenance ⇔ Refactoring
Quality Assurance ⇔ Project Management, Quality Control, Software Metrics
Project Management
Project Management • = plan the work and work the plan • PERT and Gantt charts with various options • Critical path analysis and monitoring
Are we building the system right? • Deliver what’s required on time within budget • Calculate risk to the schedule via optimistic and pessimistic estimates • Monitor the critical path to detect delays early • Plan to re-plan to meet the deadline
Traceability? Project Plan ⇔ Requirements & System
• The purpose of a plan is to detect deviations as soon as possible • Small tasks + Milestones verifiable by customer
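The critical-path idea above can be sketched in a few lines of Python. The task graph, names, and durations are made up for illustration; real PERT tools additionally track slack and optimistic/pessimistic estimates.

```python
# Critical path analysis on a small, made-up task graph.
# Each task maps to (duration, list of tasks it depends on).
tasks = {
    "requirements": (3, []),
    "design":       (5, ["requirements"]),
    "implement":    (8, ["design"]),
    "write tests":  (4, ["design"]),
    "integrate":    (2, ["implement", "write tests"]),
}

def critical_path(tasks):
    # Earliest finish time via a forward pass (memoised recursion).
    finish = {}
    def ef(name):
        if name not in finish:
            dur, deps = tasks[name]
            finish[name] = dur + max((ef(d) for d in deps), default=0)
        return finish[name]
    end = max(tasks, key=ef)
    # Walk back along the predecessors that determine the earliest finish.
    path, node = [end], end
    while tasks[node][1]:
        node = max(tasks[node][1], key=lambda d: finish[d])
        path.append(node)
    return list(reversed(path)), finish[end]

path, length = critical_path(tasks)
print(path, length)   # tasks on the critical path, and project duration
```

Any delay on a critical-path task delays the whole project, which is why the slide recommends monitoring exactly that path.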
Requirements
Use Cases • = Specify expected system behavior as a set of generic scenarios
User Stories • = Express expected functionality with the behaviour driven template
+ As a <user role> I want to <goal> so that <benefit>.
Are we building the system right? • Well specified scenarios help to verify system against requirements
Are we building the right system? • NO: validation by means of CRC Cards and role playing.
Traceability? Requirements ⇔ System
• Via proper naming conventions
Traceability? Requirements ⇔ Project Plan
• Use cases & User stories form good milestones
Domain Modeling
CRC Cards • = Analyse system as a set of classes
+ ... each of them having a few responsibilities + ... and collaborating with other classes to fulfill these responsibilities
Feature Model • = a set of reusable and configurable requirements for specifying system families (a.k.a. product lines)
Are we building the system right? • A robust domain model is easier to maintain (= long-term reliability).
Are we building the right system? • CRC Cards and role playing validate use cases. • Feature diagrams make product differences (and choices) explicit
Traceability? • Via proper naming conventions
Testing
Regression Testing • = Deterministic tests (little user intervention), answering true or false
Are we building the system right? • Tests only reveal the presence of defects, not their absence • yet ... tests verify whether a system is as right as it was before changes
Traceability? • Link from requirements specification to system source code
Test techniques • Individual tests are white box or black box tests
+ White box: exploit knowledge of internal structure - e.g., path testing, condition testing
+ Black box: exploit knowledge about inputs/outputs - e.g., input- and output partitioning + boundary conditions
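A minimal illustration of black-box partitioning plus boundary conditions, using a hypothetical `grade` function (the function, its partitions, and its boundaries are invented for the example):

```python
def grade(score):
    """Map a 0-100 score to a pass/fail label (hypothetical example)."""
    if not 0 <= score <= 100:
        raise ValueError("score out of range")
    return "pass" if score >= 50 else "fail"

# Black box: one representative value per input partition ...
assert grade(25) == "fail"    # partition [0, 50)
assert grade(75) == "pass"    # partition [50, 100]

# ... plus the boundary conditions, where defects tend to cluster.
assert grade(0) == "fail"
assert grade(49) == "fail"
assert grade(50) == "pass"
assert grade(100) == "pass"
for bad in (-1, 101):         # just outside the valid range
    try:
        grade(bad)
        assert False, "expected ValueError"
    except ValueError:
        pass
```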
Design by Contract
Contractual Obligations Explicitly Recorded in the Interface • pre-condition = obligation to be satisfied by the invoking method • post-condition = obligation to be satisfied by the method being invoked • class invariant = obligation to be satisfied by both parties
Are we building the system right? • Recorded obligations prevent defects • and ... remain in effect during changes
Traceability? • Obligations express key requirements in source code
Weak or Strong? (obligations as seen by the invoking method vs. the method being invoked)
• Precondition: weak = input is unconstrained (easy on the caller); strong = few input cases to handle (easy on the method)
• Postcondition: strong = many results guaranteed (good for the caller); weak = few results to establish (easy on the method)
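These obligations can be sketched as plain runtime assertions. The `BoundedCounter` class is hypothetical; languages such as Eiffel support contracts natively, and this is only an assertion-based approximation:

```python
class BoundedCounter:
    """Hypothetical class illustrating contracts as runtime assertions."""

    def __init__(self, limit):
        self.count, self.limit = 0, limit
        assert self._invariant()

    def _invariant(self):
        # Class invariant: obligation on both parties.
        return 0 <= self.count <= self.limit

    def increment(self):
        # Precondition: obligation on the invoking method.
        assert self.count < self.limit, "precondition violated"
        old = self.count
        self.count += 1
        # Postcondition: obligation on the method being invoked.
        assert self.count == old + 1 and self._invariant()

c = BoundedCounter(2)
c.increment()
c.increment()   # a third increment would violate the precondition
```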
Formal Specifications
Input/Output Specifications • = include logic assertions (pre- and postconditions + invariants) in the algorithm ➡ prove assertions via formal reasoning
Algebraic Specifications • = a type is described via the applicable operations ➡ deduce results via rewrite rules
Logic-Based (Model-Based) Specifications • = logic declarations state propositions that must be true
State-Based Specifications • = specify acceptable message sequences by means of a state machine
Are we building the system right? • Makes verification easier
➡ generation of test cases ➡ deduction of contractual obligations
Traceability? • Extra intermediate representation may hinder traceability
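A state-based specification can be sketched as a transition table plus an acceptance check. The file-protocol messages below are a made-up example:

```python
# Hypothetical state-based specification: acceptable message sequences
# for a file-like object, expressed as a transition table.
TRANSITIONS = {
    ("closed", "open"):  "opened",
    ("opened", "read"):  "opened",
    ("opened", "write"): "opened",
    ("opened", "close"): "closed",
}

def accepts(messages, state="closed"):
    """Check whether a message sequence is acceptable per the spec."""
    for msg in messages:
        state = TRANSITIONS.get((state, msg))
        if state is None:          # no transition: sequence rejected
            return False
    return state == "closed"       # every run must end with the file closed

assert accepts(["open", "read", "write", "close"])
assert not accepts(["read"])               # read before open
assert not accepts(["open", "read"])       # never closed
```

Such a table doubles as a test-case generator: every path through it is a candidate message sequence to exercise.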
Software Architecture
Software Architecture • = Components & Connectors describing high-level view of a system. • Decomposition implies trade-offs expressed via coupling and cohesion. • Proven solutions to recurring problems are recorded as patterns.
Are we building the system right? • For the non-functional parts of the requirements
Traceability? • Extra level of abstraction may hinder traceability
Quality Control
• = include checkpoints in the process to verify quality attributes • Formal technical reviews are very effective and cost effective!
Quality Standards (ISO 9000 and CMM) • = checklists to verify whether a quality system may be certified
Are we building the system right? Are we building the right system? • Quality Control eliminates coincidence.
Traceability? • Only when part of the quality plan/system
Project Concern = Deliver on time and within budget
External (and Internal) Product Attributes
Process Attributes
Software Metrics
Effort and Cost Estimation • = measure early products to estimate costs of later products • algorithmic cost modeling, i.e. estimate based on previous experience
Correctness? • Algorithmic cost modeling provides reliable estimates (incl. risk factor)
Traceability? • Quantification of estimates allows for negotiations
Quality Assurance • = quantify the quality model • Via internal and external product metrics
Correctness & Traceability? • Software metrics are still too immature to assure a reliable assessment
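As one concrete (and classic) instance of algorithmic cost modeling, Basic COCOMO estimates effort as a power law over size, with coefficients calibrated from historical projects. The "organic"-mode coefficients below are the published ones; the 32 KLOC input is an arbitrary example:

```python
def cocomo_basic(kloc, a=2.4, b=1.05):
    """Basic COCOMO, 'organic' mode:
    effort (person-months) = a * KLOC ** b, with a and b calibrated
    from historical project data (the essence of algorithmic cost
    modeling: measure early products, estimate later ones)."""
    effort = a * kloc ** b
    duration = 2.5 * effort ** 0.38   # nominal schedule in months
    return effort, duration

effort, months = cocomo_basic(32)     # a hypothetical 32 KLOC system
print(f"{effort:.0f} person-months over {months:.0f} months")
```

Quantified estimates like these are exactly what makes schedule negotiations traceable, as the slide notes.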
Refactoring
Refactoring Operation • = behaviour-preserving program transformation • e.g., rename, move methods and attributes up and down in the hierarchy
Refactoring Process • = improve the internal structure without altering the external behaviour
Code Smell • = symptom of a not-so-good internal structure • e.g., complex conditionals, duplicated code
Are we building the system right? • Behaviour preserving ⇒ as right as it was before (cf. tests)
Are we building the right system? • Improved internal structure ⇒ cope with requirements mismatches.
Traceability? • Renaming may help to maintain naming conventions • Refactoring makes it (too) easy to alter the code without changing the documentation
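A tiny sketch of the idea: the hypothetical `shipping_before` exhibits the complex-conditional smell; `shipping_after` is a behaviour-preserving rewrite, and a regression loop checks that both agree:

```python
# Smell: a complex conditional (hypothetical shipping-cost example).
def shipping_before(weight, express):
    if express:
        if weight <= 1:
            return 10
        else:
            return 10 + (weight - 1) * 4
    else:
        if weight <= 1:
            return 5
        else:
            return 5 + (weight - 1) * 2

# Refactored: decompose the conditional into named pieces.
def shipping_after(weight, express):
    base, per_kg = (10, 4) if express else (5, 2)
    extra = max(weight - 1, 0) * per_kg
    return base + extra

# Behaviour preserving: the regression tests must still pass.
for w in (0.5, 1, 3):
    for e in (True, False):
        assert shipping_before(w, e) == shipping_after(w, e)
```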
Overview • 1.Introduction • 2.Project Management • 3.Use Cases • 4.Domain Modeling • 5.Testing • 6.Design by Contract • 7.Formal Specification • 8.Software Architecture • 9.Quality Control • 10.Software Metrics • 11.Refactoring
Articles • The Quest for the Silver Bullet • The Case of the Killer Robot
Professional Ethics • Cases
The future: Software Engineering Tools
CHAPTER 12 – Conclusion
Assignment: Study an Article of your Choice
➡Find and read both of the following articles. Pick the one you liked the most, study it carefully, and compare the article with the course contents.
The Quest for the Silver Bullet • [Broo87] Frederick P. Brooks, Jr., “No Silver Bullet: Essence and Accidents of Software Engineering”, IEEE Computer, April 1987. • See also [Broo95] Frederick P. Brooks, Jr., “The Mythical Man-Month (20th anniversary edition)”, Addison-Wesley.
➡The article is more than 15 years old. Yet, it succeeds in explaining why there will never be an easy solution for solving the problems involved in building large and complex software systems.
The Killer Robot Case • [Epst94] Richard G. Epstein, “The use of computer ethics scenarios in software engineering education: the case of the killer robot”, Software Engineering Education: Proceedings of the 7th SEI CSEE Conference.
➡The article is a fictitious series of newspaper articles concerning a robot which killed its operators due to a software fault. The series of articles conveys the different viewpoints one might have concerning the production of quality software.
Software Engineering & Society
Your personal future is at stake (e.g., Y2K lawsuits)
Huge amounts of money are at stake (e.g., Ariane V crash)
Lives are at stake (e.g., automatic pilot)
Corporate success or failure is at stake (e.g., telephone billing, VTM launching 2nd channel)
Software became ubiquitous: our society is vulnerable! ⇒ Deontology, Licensing, …
Code of Ethics
• Software Engineering Code of Ethics and Professional Practice + ACM-site: http://www.acm.org/serving/se/code.htm + IEEE-site: http://computer.org/tab/swecc/code.htm
• Recommended by + IEEE-CS (Institute of Electrical and Electronics Engineers - Computer Society) + ACM (Association for Computing Machinery)
• “Software Engineering Code of Ethics is Approved”, Don Gotterbarn, Keith Miller, Simon Rogerson, Communications of the ACM, October 1999, Vol. 42, No. 10, pages 102-107. + Announces the revised 5.2 version of the Code
• “Using the New ACM Code of Ethics in Decision Making”, Ronald E. Anderson, Deborah G. Johnson, Donald Gotterbarn, Judith Perrolle, Communications of the ACM, February 1993, Vol. 36, No. 2, pages 98-104. + Discusses 9 cases of situations you might encounter and how (an older version of) the code addresses them
Code of Ethics: 8 Principles
• ACM-site: http://www.acm.org/serving/se/code.htm • IEEE-site: http://computer.org/tab/swecc/code.htm
1. PUBLIC • Software engineers shall act consistently with the public interest.
2. CLIENT AND EMPLOYER • Software engineers shall act in a manner that is in the best interests of their client and employer, consistent with the public interest.
3. PRODUCT • Software engineers shall ensure that their products and related modifications meet the highest professional standards possible.
4. JUDGMENT • Software engineers shall maintain integrity and independence in their professional judgment.
5. MANAGEMENT • Software engineering managers and leaders shall subscribe to and promote an ethical approach to the management of software development and maintenance.
6. PROFESSION • Software engineers shall advance the integrity and reputation of the profession consistent with the public interest.
7. COLLEAGUES • Software engineers shall be fair to and supportive of their colleagues.
8. SELF • Software engineers shall participate in lifelong learning regarding the practice of their profession and shall promote an ethical approach to the practice of the profession.
Case: Privacy - Description
Case Description • You consult for a company concerning a database for personnel management. • The database will include sensitive data: performance evaluations, medical data. • The system costs too much and the company wants to cut back on security.
What does the code say? • 1.03. Approve software only if they have a well-founded belief that it is safe, meets specifications, passes appropriate tests, and does not diminish quality of life, diminish privacy or harm the environment. The ultimate effect of the work should be to the public good.
• 3.12. Work to develop software and related documents that respect the privacy of those who will be affected by that software.
➡ Situation is unacceptable.
Case study: Privacy - Solution
Applicable Clauses • 1.02. Moderate the interests of the software engineer, the employer, the client and the users with the public good. • 1.04. Disclose to appropriate persons or authorities any actual or potential danger to the user, the public, or the environment, that they reasonably believe to be associated with software or related documents. • 2.07. Identify, document, and report significant issues of social concern, of which they are aware, in software or related documents, to the employer or the client. • 6.09. Ensure that clients, employers, and supervisors know of the software engineer's commitment to this Code of Ethics, and the subsequent ramifications of such commitment.
Actions • Try to convince management to keep high security standards. • Include in the contract a clause to cancel the contract when it goes against the code of ethics. • Alert other institutions if you later hear that others accepted the contract.
Case: Unreliability
Case Description • You’re the team leader of a team building software for calculating taxes. • Your team and your boss are aware that the system contains a lot of defects. Consequently you state that the product can’t be shipped in its current form. • Your boss ships the product anyway, with a disclaimer “Company X is not responsible for errors resulting from the use of this program”.
What does the code say? • 1.03. Approve software only if they have a well-founded belief that it is safe, meets specifications, passes appropriate tests, and does not diminish quality of life, diminish privacy or harm the environment. The ultimate effect of the work should be to the public good. • 5.11. Not ask a software engineer to do anything inconsistent with this Code. • 5.12. Not punish anyone for expressing ethical concerns about a project.
➡ Disclaimer does not apply: can only be made in “good conscience”.
➡ In court you cannot be held liable.
Agile Quality Assurance
Introduction • Reliability vs. Agility
Mining Software Repositories
• Tests (= visualisation) + How good was our testing process?
• Bugs (= text mining) + Who should fix this bug? + How long will it take to fix this bug? + What is the severity of this bug?
• Expertise (= social network analysis) + Who are the key personalities? + Who can help me with this file? + Where should we focus our (regression) tests?
Conclusion • The hype-cycle
The Future — Software Engineering Tools
Innovation
Business Models
• 1475 — Kiva Han coffee house (Constantinople)
• 1529 — European coffee house (Vienna)
• 1971 — Starbucks (Seattle)
Underlying Technology
• 1908 — patent on the paper filter
• 1946 — commercial piston espresso machine
• 2000 — Nespresso
• 2001 — Senseo
Technology changes every 20 years … underlying business models rarely change!
Innovation in ICT
Embedded … Internet
Underlying Technology
• ENIAC, 1945 • IBM PC, 1981 • NEC UltraLite, 1989 • iPad, 2010
Technology changes every 5 years … underlying business models change often!
Market pressure in ICT
RELIABILITY AGILITY
Measure of innovation • # products in portfolio younger than 5 years
+ in ICT usually more than 1/2 the portfolio
Significant investment in R&D • more products … faster
Reliability vs. Agility
Software is vital to our society ⇒ Software must be reliable
Traditional Software Engineering Reliable = Software without bugs
Today’s Software Engineering Reliable = Easy to Adapt
Striving for RELIABILITY
(Optimise for perfection)
Striving for AGILITY
(Optimise for development speed)
On the Origin of Species
Software Evolution
It is not age that turns a piece of software into a legacy system, but the rate at which it has been developed and adapted without being reengineered.
[Deme02] Demeyer, Ducasse and Nierstrasz: Object-Oriented Reengineering Patterns
Components are very brittle … After a while one inevitably resorts to glue :)
Take action here … instead of waiting until here.
Software Repositories & Archives
Version Control • CVS, Subversion, … • Rational ClearCase • Perforce • Visual SourceSafe • …
Issue Tracking • Bugzilla • BugTracker.NET • ClearQuest • JIRA • Mantis • Visual Studio Team Foundation Server • …
Automate the Build • make • Ant, Maven • MSBuild • OpenMake • Build Forge • …
Automated Testing • HP QuickTest Professional • IBM Rational Functional Tester • Maveryx • Selenium • TestComplete • Visual Studio Test Professional 2010 (Microsoft) • …
… mailing archives, newsgroups, chat-boxes, Facebook, Twitter, …
All of a sudden empirical research has what any empirical science needs: a large corpus of objects to analyze.
[Bertrand Meyer's technology blog]
Mining Software Repositories
The Mining Software Repositories (MSR) field analyzes the rich data available in software repositories to uncover interesting and actionable information about software systems and projects.
Conferences • 2017—14th edition, Buenos Aires, Argentina • 2016—13th edition, Austin, Texas • 2015—12th edition, Florence, Italy • 2014—11th edition, Hyderabad, India • 2013—10th edition, San Francisco, CA, USA • 2012—9th edition, Zürich, CH • 2011—8th edition, Honolulu, HI, USA • 2010—7th edition, Cape Town, ZAF • 2009—6th edition, Vancouver, CAN • 2008—5th edition, Leipzig, DEU • 2007—4th edition, Minneapolis, MN, USA • 2006—3rd edition, Shanghai, CHN • 2005—2nd edition, Saint Louis, MO, USA • 2004—1st edition, Edinburgh, UK
Hall of Fame—Mining Challenge Winners • 2017—TravisTorrent (GitHub) • 2016—BOA (SourceForge & GitHub) • 2015—StackOverflow • 2014—Sentiment Analysis of Commit Messages in GitHub: An Empirical Study • 2013—Encouraging User Behaviour with Achievements: An Empirical Study [StackOverflow] • 2012—Do the Stars Align? Multidimensional Analysis of Android's Layered Architecture • 2011—Apples vs. Oranges? An Exploration of the Challenges of Comparing the Source Code of Two Software Systems [NetBeans+Eclipse] • 2010—Cloning and Copying between GNOME Projects • 2009—On the Use of Internet Relay Chat (IRC) Meetings by Developers of the GNOME GTK+ Project • 2008—A Newbie's Guide to Eclipse APIs • 2007—Mining Eclipse Developer Contributions via Author-Topic Models • 2006—A Study of the Contributors of PostgreSQL
Test Monitor — Change History
http://swerl.tudelft.nl/bin/view/Main/TestHistory Case = Checkstyle
Chart annotations: Single Test • Unit Testing • Integration Tests • Phased Testing • Steady Testing & Coding
Test Monitor — Growth History
Chart annotations: Few Tests • Steady Testing & Coding • Increased Test Activity
Test Monitor — Coverage Evolution
Chart annotations: Good Class Coverage • Sudden Drop!?
Description ⇒ text Mining
Stack Traces ⇒ Link to source code
Product/Component Specific vocabulary
Suggestions?
Bug Database (bug reports)
(1) & (2) Extract and preprocess bug reports
(3) Train a predictor (text mining)
(4) For a new report, predict • severity • assigned-to • estimated time
Evaluate each prediction as • true positive • false positive • true negative • false negative ⇒ precision & recall
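The pipeline can be sketched end-to-end with a toy bag-of-words predictor. The reports, labels, and scoring rule are all made up; real studies use proper classifiers and far larger corpora:

```python
from collections import Counter

# (1)-(3) Train a naive bag-of-words severity predictor on made-up reports.
train = [
    ("crash on startup null pointer", "severe"),
    ("segfault when saving file", "severe"),
    ("typo in about dialog", "minor"),
    ("misaligned button in settings", "minor"),
]
freq = {"severe": Counter(), "minor": Counter()}
for text, label in train:
    freq[label].update(text.split())

def predict(text):
    # (4) Score a new report by word overlap with each class.
    scores = {c: sum(freq[c][w] for w in text.split()) for c in freq}
    return max(scores, key=scores.get)

# Evaluate: precision & recall for the "severe" class on a tiny test set.
test = [("crash when opening file", "severe"), ("typo on button", "minor")]
tp = sum(1 for t, l in test if predict(t) == "severe" == l)
fp = sum(1 for t, l in test if predict(t) == "severe" != l)
fn = sum(1 for t, l in test if predict(t) != "severe" == l)
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
```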
Results
Who should fix this bug? • Cases: Eclipse, Firefox, gcc • Precision: Eclipse 57%, Firefox 64%, gcc 6% • Recall: —
How long will it take to fix this bug? • Case: JBoss • Depends on the component: many similar reports, off by one hour; few similar reports, off by 7 hours
What is the severity of this bug? • Cases: Mozilla, Eclipse, Gnome • Precision: Mozilla/Eclipse 67%-73%, Gnome 75%-82% • Recall: Mozilla/Eclipse 50%-75%, Gnome 68%-84%
Promising results but … • how much training is needed? • how reliable is the data? (estimates, severity, assigned-to) • does this generalize? (on an industrial scale?) ⇒ replication is needed
Space … Place
© Inspired by Margaret-Anne (Peggy) Storey; Keynote for MSR 2012, Zurich, Switzerland
270 M. Pinzger and H.C. Gall
Fig. 13.1 Communication paths extracted from the Eclipse Platform Core developer mailing list
way. Figure 13.1 shows an excerpt of a mail thread from the online mailing list archive of Eclipse Platform Core.
A mail addressed to a mailing list is processed by the reflector and sent to all subscribers. This means that the To: address is always the mailing list address itself; hence, there is no explicit receiver address. In our example this address is platform-core-dev. To model the communication path between sender and receiver, the receiver needs to be reconstructed from subsequent answering mails. The identification of the sender is given by the From: field which is denoted by the name on the right side of a message. For determining the receivers of emails we analyze the tree structure of a mail thread and compute the To: and Cc: paths.
Figure 13.1 illustrates the two paths in our example thread whereas gray arrows denote the To: path and light gray arrows the Cc: path. A gray arrow is established between an initial mail and its replies. For example, Philippe Ombredanne is first replying to the mail of Thomas Watson, so in this case Philippe Ombredanne is the sender and Thomas Watson is the receiver of the mail. To derive Cc: receivers we consider the person answering a mail as an intended receiver of this mail. In case this person is already the To: receiver (as it applies with the mails number 3–5 between Bob Foster and Pascal Rapicault) no additional path is derived, because we assume that a mail is not sent to a person twice.
For importing the data from the mailing lists archives we extended the iQuest tool. iQuest is part of TeCFlow,1 a set of tools to visualize the temporal evolution of communication patterns among groups of people. It contains a component to parse mailing lists and import them into a MySQL database. Our extension aims at including the follow-up information of mails to fully reconstruct the structure of a mail-thread. The sample thread shown above consists of 15 mails that result in 25 communication paths.
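The To:-path reconstruction described above can be sketched in a few lines: whoever answers mail m is taken as sending to the author of m. The thread structure and names echo the example, but the data layout is invented:

```python
# Reconstruct To: communication paths from a reply tree, as described
# above: the sender of a reply is connected to the author of the mail
# being answered. Thread structure and names are illustrative only.
thread = {
    "m1": {"from": "Thomas", "reply_to": None},
    "m2": {"from": "Philippe", "reply_to": "m1"},
    "m3": {"from": "Bob", "reply_to": "m2"},
}

def to_paths(thread):
    paths = []
    for mail in thread.values():
        parent = mail["reply_to"]
        if parent is not None:
            # (sender, receiver) edge of the social network
            paths.append((mail["from"], thread[parent]["from"]))
    return paths

print(to_paths(thread))   # [('Philippe', 'Thomas'), ('Bob', 'Philippe')]
```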
13.3.1.2 Deriving Communication Paths from Bug Reports
The second source outlined for modeling communication paths is a bug tracking repository, such as Bugzilla. Bugzilla users create reports and comments and give
1 http://www.ickn.org/ickndemo/
13 Dynamic Analysis of Communication and Collaboration 273
Fig. 13.2 Integrated model for representing communication and collaboration data in OSS projects
person. In every other case the person is assumed to be unknown and a new person entity is added to the database.
While this algorithm works fine for person information obtained from Bugzilla and mailing lists, there are problems with matching persons obtained from CVS log data. Typically, the author stored in CVS logs indicates the CVS user name, but not the real name of a person. Because of the high number of false matches, the mapping of these persons is done manually.
In addition to the information of a person, email addresses contain domain information that, for example, denotes the business unit of a developer. We use this information to assign developers to teams. We obtain email addresses that have been generated with MHonArc.2 The problem is that MHonArc provides a spam mode which deters spam address harvesters by hiding the domain information of email addresses. For example, the email address of Chris McGee is displayed as
2 http://www.mhonarc.org/
Construct Social Network
Mailing lists + Bug reports + CVS Logs ⇒ social network
Code Ownership (& Alien Commits)
Fig. 13.7 Collaboration in the Eclipse Platform Core project observed in the time from 14th to 28th February 2005
information ideally gets processed. Figure 13.8 illustrates the communication via the developer mailing list and Bugzilla data over 21 months. The amount of communication (i.e., the number of communication paths reconstructed from bug reports and emails) is illustrated by the width of edges. The wider the edges of a person’s node are, the more this person communicated with other developers.
The graph in Fig. 13.8 shows the core development team whose members frequently communicate with each other. Rafael Chaves, Dj Houghton, Jeff Mcaffer, Thomas Watson, John Arthorne, and Pascal Rapicault form the core team. They are the communicators who keep the network together and play an important role within the project. Interesting is that they all belong to either the group @ca.ibm.com or [email protected] as indicated by the shadows of rectangles representing these developers. Another highly connected group is formed by the Swiss team (@ch.ibm.ch) whose members are represented by the nodes on the right side of the graph. Almost each developer of the Swiss team is in touch with the US team; however, Markus Keller and Daniel Megert turn out as the main communicators between the two teams during that time.
Another interesting finding concerns the environment via which the developers communicated. Most of the communication was via Bugzilla bug reports indicated by the gray edges. Only the core team also used the mailing list to discuss Eclipse
Code Owners
Alien Commit
Key personalities
Fig. 13.8 Communicators of the Eclipse Platform Core project as from May 2004 to February 2006
Platform Core relevant issues. Such findings are of particular interest when new ways of communication are considered.
13.5.4 Project Dynamics
Newcomers should be integrated fast into development teams to rapidly increase productivity and foster synergy among team members. With STNA-Cockpit the project manager can observe how newcomers actually are integrated into their teams. For this, the project manager selects the starting observation period and uses the time-navigation facility of STNA-Cockpit to investigate the evolution of the communication and collaboration network over time. The graph animation allows the project manager to observe how the newcomer behaves concerning communication and collaboration with other team members. In particular, she looks for communication paths that tell her the newcomer gets actively involved
communicators
“swiss group”
Socialization
Fig. 13.9 Socialization of Kevin Barness in the Eclipse Platform Core project: snapshots (a) 1st half of April 2004, (b) 2nd half of April 2004, (c) 1st half of June 2004, (d) 2nd half of June 2004
in discussions on the developers mailing lists and bug reports. In addition, she observes whether the newcomer contributes to the plug-ins and finally takes over responsibility of portions of the source code.
Consider the following scenario in which Kevin Barness is entering the US team @ca.ibm.com of the Eclipse Platform Core project in April 2004. Figure 13.9 depicts various snapshots taken from the network created for subsequent points in time. Kevin Barness is starting as a developer in the Eclipse Platform Core team at the beginning of April 2004. His first action is to get in touch with some key personalities of the project, namely Rafael Chaves and John Arthorne. His first contacts are visualized by the graphs depicted by (Fig. 13.9a, b). In the following weeks he communicates also with other project members to get more involved into the project (see Fig. 13.9c), namely Darin Wright and Darin Swanson. As (Fig. 13.9d) illustrates, Darin Wright is a developer and Darin Swanson the owner of the files that are going to be modified by Kevin. Rafael Chaves seems to play the role of the connector who introduces the new developer Kevin Barness to the responsible persons. According to the graph, he is communicating with two senior developers.
Expertise Browser
Used within a geographically dispersed team • 120 developers at two sites (Germany and England), grew to 250 developers (incl. a satellite site in France) • satellite teams: locate expertise • established teams: who is doing what?
Code Ownership vs. Code Quality
Figure 2: Ownership graphs for two binaries in Windows: (a) Ownership of A.dll by Developers, (b) Ownership of B.dll by Developers
of experience used by Mockus and Weiss also used the number of changes. However, prior literature [14] has shown high correlation (above 0.9) between number of changes and number of lines contributed and we have found similar results in Windows, indicating that our results would not change significantly. With these terms defined, we now introduce our metrics.
• Minor – number of minor contributors
• Major – number of major contributors
• Total – total number of contributors
• Ownership – proportion of ownership for the contributor with the highest proportion of ownership
Figure 1 shows the proportion of commits for each of the developers that contributed to abocomp.dll in Windows, in decreasing order. This library had a total of 918 commits made during the development cycle. The top contributing engineer made 379 commits, roughly 41%. Five engineers made at least 5% of the commits (at least 46 commits). Twelve engineers made less than 5% of the commits (less than 46 commits). Finally, there were a total of seventeen engineers that made commits to the binary. Thus, our metrics for abocomp.dll are:
Metric      Value
Minor       12
Major       5
Total       17
Ownership   0.41
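Under the definitions above (a minor contributor falls below 5% of a binary's commits), the abocomp.dll metrics can be reproduced with a short sketch. The individual per-engineer commit counts are illustrative, chosen only to match the totals reported in the text (918 commits, top contributor 379):

```python
# Illustrative per-engineer commit counts for one binary.
commits = [379, 120, 90, 70, 50,                            # >= 5% each
           30, 25, 22, 20, 18, 17, 16, 15, 14, 12, 10, 10]  # < 5% each

total_commits = sum(commits)           # 918
threshold = 0.05 * total_commits       # 45.9 commits

major = sum(1 for c in commits if c >= threshold)  # engineers at >= 5%
minor = sum(1 for c in commits if c < threshold)   # engineers below 5%
total = len(commits)                               # all contributors
ownership = max(commits) / total_commits           # top contributor's share

print(minor, major, total, round(ownership, 2))  # 12 5 17 0.41
```

In practice these counts would be extracted from version-control history (e.g. per-author commit tallies) rather than written out by hand.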
4. HYPOTHESES
We begin with the observation that a developer with lower expertise is more likely to introduce bugs into the code. A developer who has made a small proportion of the commits to a binary likely has less expertise and is more likely to make an error. We expect that as the number of developers working on a component increases, the component may become “fragmented” and the difficulty of vetting and coordinating all these minor contributions becomes an obstacle to good quality. Thus if Minor is high, quality suffers.
Hypothesis 1 - Software components with many minor contributors will have more failures than software components that have fewer.
We also look at the proportion of ownership for the highest contributing developer for each component (Ownership). If Ownership is high, that indicates that there is one developer who “owns” the component and has a high level of expertise. This person can also act as a single point of contact for others who need to use the component, need changes to it, or just have questions about it. We theorize that when such a person exists, the software quality is higher, resulting in fewer failures.
Hypothesis 2 - Software components with a high level of ownership will have fewer failures than components with lower top ownership levels.
If the number of minor contributors negatively affects software quality, the next question to ask is, “Why do some binaries have so many minor contributors?” We have observed, both at Microsoft and within OSS projects such as Python and Postgres, that during the process of maintenance, feature addition, or bug fixing, owners of one component often need to modify other components that the first relies on or is relied upon by. As a simple example, a developer tasked with fixing media playback in Internet Explorer may need to make changes to the media playback interface library even though the developer is not the designated owner and has limited experience with this component. This leads to our hypothesis.
Hypothesis 3 - Minor contributors to components will be Major contributors to other components that are related through dependency relationships.
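Hypothesis 3 can be probed mechanically once commit counts and a dependency graph are available. The sketch below uses entirely hypothetical binaries, engineers, and dependency data, merely to show the shape of such a check:

```python
# Hypothetical data: binary -> {engineer: commit count}.
commits = {
    "media.dll":  {"alice": 90, "bob": 4},    # bob is a minor contributor here
    "player.exe": {"bob": 120, "carol": 50},  # ...but a major contributor here
}
# Hypothetical dependency graph: client binary -> list of dependencies.
depends_on = {"player.exe": ["media.dll"]}

def contributors(binary, minor):
    """Return the minor (True) or major (False) contributors to a binary."""
    counts = commits[binary]
    total = sum(counts.values())
    return {e for e, c in counts.items() if (c / total < 0.05) == minor}

# Count cases where a minor contributor to a dependency is a
# major contributor to a client that depends on it.
hits = 0
for binary, deps in depends_on.items():
    for dep in deps:
        hits += len(contributors(dep, minor=True) &
                    contributors(binary, minor=False))
print(hits)
```

Evidence for the hypothesis would show up as many such hits relative to the total number of minor contributions.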
Finally, if low-expertise contributions do have a large impact on software quality, then we expect that defect prediction techniques will be affected by their inclusion or removal. We therefore replicate prior defect prediction techniques and compare results when using all data, data derived only from changes by minor contributors, and data derived only from changes by major contributors. We expect that when data from minor contributors is removed, the quality of the defect prediction will suffer.
Software components with many minor contributors will have more failures than software components that have fewer.
Software components with a high level of ownership will have fewer failures than components with lower top ownership levels.
Data from Windows Vista and Windows 7
Q&A support
In CHI2016 Proceedings

Agile Quality Assurance
Introduction • Reliability vs. Agility
Mining Software Repositories
• Tests (= visualisation) + How good was our testing process?
• Bugs (= text mining) + Who should fix this bug? + How long will it take to fix this bug? + What is the severity of this bug?
• Expertise (= social network analysis) + Who are the key personalities? + Who can help me with this file? + Where should we focus our (regression) tests? + Stack Overflow
Conclusion • The hype-cycle
Table Of Contents
!50
12. Conclusion
Hype Cycle
!51
Hype Cycle © Gartner
[Figure: visibility (y-axis) vs. maturity (x-axis), with the phases Technology Trigger, Peak of Inflated Expectations, Trough of Disillusionment, Slope of Enlightenment, Plateau of Productivity]
12. Conclusion
The Future?
!52
Personal Opinion
[Figure: the same Gartner Hype Cycle, annotated with a personal positioning of software engineering tools along the curve]
IBM (Patents) ⇒ Eclipse
Microsoft Research ⇒ Team Foundation Server
12. Conclusion
Summary (i): You should know the answers to these questions
• Name 3 items from the code of ethics and provide a one-line explanation.
• If you are an independent consultant, how can you ensure that you will not have to act against the code of ethics?
• What would be a possible metric for measuring the amount of innovation of a manufacturing company?
• What can we do to prevent glue code from eroding our component architecture?
• Explain in 3 sentences how a software tool could make recommendations for certain fields in a bug report (severity, assigned-to, estimated time).
• Which components would be more susceptible to defects: the ones with a high level of ownership or the ones with a low level of ownership?
When you chose the “No Silver Bullet” paper
• What’s the distinction between essential and accidental complexity?
• Name 3 reasons why building software is essentially a hard task, and provide a one-line explanation for each.
• Why is “object-oriented programming” no silver bullet?
• Why is “program verification” no silver bullet?
• Why are “components” a potential silver bullet?
When you chose the “Killer Robot” paper
• Which regression tests would you have written to prevent the “killer robot”?
• Was code reviewing applied as part of the QA process? Why (not)?
• Why was the waterfall process disastrous in this particular case?
• Why was the user-interface design flawed?
!53 12. Conclusion
Summary (ii): Can you answer the following questions?
• You are an experienced designer and you heard that the sales people earn more money than you do. You want to ask your boss for a salary increase; how would you argue your case?
• Software products are usually released with a disclaimer like “Company X is not responsible for errors resulting from the use of this program”. Does this mean that you shouldn’t test your software? Motivate your answer.
• You are a QA manager and are requested to produce a monthly report about the quality of the test process. How would you do that?
• Give 3 scenarios to illustrate why it would be interesting to know which developers are most experienced with a given piece of code.
• It has been demonstrated that it is feasible to make reliable recommendations concerning fields in a bug report. What is still needed before such a recommendation tool can be incorporated into state-of-the-art tools (bugzilla, jira, …)?
When you chose the “No Silver Bullet” paper
• Explain why incremental development is a promising attack on conceptual essence. Give examples from the different topics addressed in the course.
• “Software components” are said to be a promising attack on conceptual essence. Which techniques in the course are applicable? Which techniques aren’t?
When you chose the “Killer Robot” paper
• Recount the story of the Killer Robot case. List the three most important causes for the failure and argue why you think these are the most important.
!54