12. Conclusion
Overview • 1.Introduction • 2.Project Management • 3.Requirements • 4.Domain Modeling • 5.Testing • 6.Design by Contract • 7.Formal Specification • 8.Software Architecture • 9.Quality Control • 10.Software Metrics • 11.Refactoring
Articles • The Quest for the Silver Bullet • The Case of the Killer Robot
Professional Ethics • Cases
The future: Software Engineering Tools
CHAPTER 12 – Conclusion
Software Product & Process
•Software Process: Requirements Collection + Analysis + Design + Implementation+ Testing + Maintenance + Quality Assurance
•Software Product: Requirements Specification (= functional & non-functional) + System (= executable + code + documentation)
Requirement Specification ⇒ System
Requirement Collection → Analysis → Design → Implementation → Testing → Maintenance (each + Quality Assurance)
Evaluation Criteria
Requirement Specification ⇔ System
Two evaluation criteria to assess the techniques applied during the process
Correctness •Are we building the right product? •Are we building the product right?
Traceability •Can we deduce which product components will be affected by changes?
For large scale projects, a feedback loop is necessary ... ⇒ Mix of Iterative and Incremental Development
Overview
Requirements Collection ⇔ Use Cases & User Stories
Analysis ⇔ Domain Modeling
Design ⇔ Software Architecture & Formal Specifications
Implementation ⇔ Design by Contract
Testing ⇔ Testing
Maintenance ⇔ Refactoring
Quality Assurance ⇔ Project Management, Quality Control, Software Metrics
Project Management
Project Management • = plan the work and work the plan • PERT and Gantt charts with various options • Critical path analysis and monitoring
Are we building the system right? • Deliver what’s required on time within budget • Calculate risk to the schedule via optimistic and pessimistic estimates • Monitor the critical path to detect delays early • Plan to re-plan to meet the deadline
Traceability? Project Plan ⇔ Requirements & System
• The purpose of a plan is to detect deviations as soon as possible • Small tasks + Milestones verifiable by customer
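The critical-path idea above can be sketched in a few lines of Python. The task graph, names, and durations are made up for illustration; real PERT tools additionally track slack and optimistic/pessimistic estimates.

```python
# Critical path analysis on a small, made-up task graph.
# Each task maps to (duration, list of tasks it depends on).
tasks = {
    "requirements": (3, []),
    "design":       (5, ["requirements"]),
    "implement":    (8, ["design"]),
    "write tests":  (4, ["design"]),
    "integrate":    (2, ["implement", "write tests"]),
}

def critical_path(tasks):
    # Earliest finish time via a forward pass (memoised recursion).
    finish = {}
    def ef(name):
        if name not in finish:
            dur, deps = tasks[name]
            finish[name] = dur + max((ef(d) for d in deps), default=0)
        return finish[name]
    end = max(tasks, key=ef)
    # Walk back along the predecessors that determine the earliest finish.
    path, node = [end], end
    while tasks[node][1]:
        node = max(tasks[node][1], key=lambda d: finish[d])
        path.append(node)
    return list(reversed(path)), finish[end]

path, length = critical_path(tasks)
print(path, length)   # tasks on the critical path, and project duration
```

Any delay on a critical-path task delays the whole project, which is why the slide recommends monitoring exactly that path.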
Requirements
Use Cases • = Specify expected system behavior as a set of generic scenarios
User Stories • = Express expected functionality with the behaviour driven template
+ As a <user role> I want to <goal> so that <benefit>.
Are we building the system right? • Well specified scenarios help to verify system against requirements
Are we building the right system? • NO: validation by means of CRC Cards and role playing.
Traceability? Requirements ⇔ System
• Via proper naming conventions
Traceability? Requirements ⇔ Project Plan
• Use cases & User stories form good milestones
Domain Modeling
CRC Cards • = Analyse system as a set of classes
+ ... each of them having a few responsibilities + ... and collaborating with other classes to fulfill these responsibilities
Feature Model • = a set of reusable and configurable requirements for specifying system families (a.k.a. product lines)
Are we building the system right? • A robust domain model is easier to maintain (= long-term reliability).
Are we building the right system? • CRC Cards and role playing validate use cases. • Feature diagrams make product differences (and choices) explicit
Traceability? • Via proper naming conventions
Testing
Regression Testing • = Deterministic tests (little user intervention), answering true or false
Are we building the system right? • Tests only reveal the presence of defects, not their absence • yet ... tests verify whether a system is as right as it was before changes
Traceability? • Link from requirements specification to system source code
Test techniques • Individual tests are white box or black box tests
+ White box: exploit knowledge of internal structure - e.g., path testing, condition testing
+ Black box: exploit knowledge about inputs/outputs - e.g., input- and output partitioning + boundary conditions
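A minimal illustration of black-box partitioning plus boundary conditions, using a hypothetical `grade` function (the function, its partitions, and its boundaries are invented for the example):

```python
def grade(score):
    """Map a 0-100 score to a pass/fail label (hypothetical example)."""
    if not 0 <= score <= 100:
        raise ValueError("score out of range")
    return "pass" if score >= 50 else "fail"

# Black box: one representative value per input partition ...
assert grade(25) == "fail"    # partition [0, 50)
assert grade(75) == "pass"    # partition [50, 100]

# ... plus the boundary conditions, where defects tend to cluster.
assert grade(0) == "fail"
assert grade(49) == "fail"
assert grade(50) == "pass"
assert grade(100) == "pass"
for bad in (-1, 101):         # just outside the valid range
    try:
        grade(bad)
        assert False, "expected ValueError"
    except ValueError:
        pass
```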
Design by Contract
Contractual Obligations Explicitly Recorded in the Interface • pre-condition = obligation to be satisfied by the invoking method • post-condition = obligation to be satisfied by the method being invoked • class invariant = obligation to be satisfied by both parties
Are we building the system right? • Recorded obligations prevent defects • and ... remain in effect during changes
Traceability? • Obligations express key requirements in source code
Weak or Strong? (obligations as seen by the invoking method vs. the method being invoked)
• Precondition: weak = input is unconstrained (easy on the caller); strong = few input cases to handle (easy on the method)
• Postcondition: strong = many results guaranteed (good for the caller); weak = few results to establish (easy on the method)
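These obligations can be sketched as plain runtime assertions. The `BoundedCounter` class is hypothetical; languages such as Eiffel support contracts natively, and this is only an assertion-based approximation:

```python
class BoundedCounter:
    """Hypothetical class illustrating contracts as runtime assertions."""

    def __init__(self, limit):
        self.count, self.limit = 0, limit
        assert self._invariant()

    def _invariant(self):
        # Class invariant: obligation on both parties.
        return 0 <= self.count <= self.limit

    def increment(self):
        # Precondition: obligation on the invoking method.
        assert self.count < self.limit, "precondition violated"
        old = self.count
        self.count += 1
        # Postcondition: obligation on the method being invoked.
        assert self.count == old + 1 and self._invariant()

c = BoundedCounter(2)
c.increment()
c.increment()   # a third increment would violate the precondition
```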
Formal Specifications
Input/Output Specifications • = include logic assertions (pre- and postconditions + invariants) in the algorithm ➡ prove assertions via formal reasoning
Algebraic Specifications • = a type is described via the applicable operations ➡ deduce results via rewrite rules
Logic-Based (Model-Based) Specifications • = logic declarations state propositions that must be true
State-Based Specifications • = specify acceptable message sequences by means of a state machine
Are we building the system right? • Makes verification easier
➡ generation of test cases ➡ deduction of contractual obligations
Traceability? • Extra intermediate representation may hinder traceability
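A state-based specification can be sketched as a transition table plus an acceptance check. The file-protocol messages below are a made-up example:

```python
# Hypothetical state-based specification: acceptable message sequences
# for a file-like object, expressed as a transition table.
TRANSITIONS = {
    ("closed", "open"):  "opened",
    ("opened", "read"):  "opened",
    ("opened", "write"): "opened",
    ("opened", "close"): "closed",
}

def accepts(messages, state="closed"):
    """Check whether a message sequence is acceptable per the spec."""
    for msg in messages:
        state = TRANSITIONS.get((state, msg))
        if state is None:          # no transition: sequence rejected
            return False
    return state == "closed"       # every run must end with the file closed

assert accepts(["open", "read", "write", "close"])
assert not accepts(["read"])               # read before open
assert not accepts(["open", "read"])       # never closed
```

Such a table doubles as a test-case generator: every path through it is a candidate message sequence to exercise.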
Software Architecture
Software Architecture • = Components & Connectors describing high-level view of a system. • Decomposition implies trade-offs expressed via coupling and cohesion. • Proven solutions to recurring problems are recorded as patterns.
Are we building the system right? • For the non-functional parts of the requirements
Traceability? • Extra level of abstraction may hinder traceability
Quality Control
• = include checkpoints in the process to verify quality attributes • Formal technical reviews are very effective and cost effective!
Quality Standards (ISO 9000 and CMM) • = checklists to verify whether a quality system may be certified
Are we building the system right? Are we building the right system? • Quality Control eliminates coincidence.
Traceability? • Only when part of the quality plan/system
Project Concern = Deliver on time and within budget
External (and Internal) Product Attributes
Process Attributes
Software Metrics
Effort and Cost Estimation • = measure early products to estimate costs of later products • algorithmic cost modeling, i.e. estimate based on previous experience
Correctness? • Algorithmic cost modeling provides reliable estimates (incl. risk factor)
Traceability? • Quantification of estimates allows for negotiations
Quality Assurance • = quantify the quality model • Via internal and external product metrics
Correctness & Traceability? • Software metrics are still too immature to assure a reliable assessment
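As one concrete (and classic) instance of algorithmic cost modeling, Basic COCOMO estimates effort as a power law over size, with coefficients calibrated from historical projects. The "organic"-mode coefficients below are the published ones; the 32 KLOC input is an arbitrary example:

```python
def cocomo_basic(kloc, a=2.4, b=1.05):
    """Basic COCOMO, 'organic' mode:
    effort (person-months) = a * KLOC ** b, with a and b calibrated
    from historical project data (the essence of algorithmic cost
    modeling: measure early products, estimate later ones)."""
    effort = a * kloc ** b
    duration = 2.5 * effort ** 0.38   # nominal schedule in months
    return effort, duration

effort, months = cocomo_basic(32)     # a hypothetical 32 KLOC system
print(f"{effort:.0f} person-months over {months:.0f} months")
```

Quantified estimates like these are exactly what makes schedule negotiations traceable, as the slide notes.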
Refactoring
Refactoring Operation • = behaviour-preserving program transformation • e.g., rename, move methods and attributes up and down in the hierarchy
Refactoring Process • = improve the internal structure without altering the external behaviour
Code Smell • = symptom of a not-so-good internal structure • e.g., complex conditionals, duplicated code
Are we building the system right? • Behaviour preserving ⇒ as right as it was before (cf. tests)
Are we building the right system? • Improved internal structure ⇒ cope with requirements mismatches.
Traceability? • Renaming may help to maintain naming conventions • Refactoring makes it (too) easy to alter the code without changing the documentation
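A tiny sketch of the idea: the hypothetical `shipping_before` exhibits the complex-conditional smell; `shipping_after` is a behaviour-preserving rewrite, and a regression loop checks that both agree:

```python
# Smell: a complex conditional (hypothetical shipping-cost example).
def shipping_before(weight, express):
    if express:
        if weight <= 1:
            return 10
        else:
            return 10 + (weight - 1) * 4
    else:
        if weight <= 1:
            return 5
        else:
            return 5 + (weight - 1) * 2

# Refactored: decompose the conditional into named pieces.
def shipping_after(weight, express):
    base, per_kg = (10, 4) if express else (5, 2)
    extra = max(weight - 1, 0) * per_kg
    return base + extra

# Behaviour preserving: the regression tests must still pass.
for w in (0.5, 1, 3):
    for e in (True, False):
        assert shipping_before(w, e) == shipping_after(w, e)
```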
Overview • 1.Introduction • 2.Project Management • 3.Use Cases • 4.Domain Modeling • 5.Testing • 6.Design by Contract • 7.Formal Specification • 8.Software Architecture • 9.Quality Control • 10.Software Metrics • 11.Refactoring
Articles • The Quest for the Silver Bullet • The Case of the Killer Robot
Professional Ethics • Cases
The future: Software Engineering Tools
CHAPTER 12 – Conclusion
Assignment: Study an Article of your Choice
➡Find and read both of the following articles. Pick the one you liked the most, study it carefully, and compare the article with the course contents.
The Quest for the Silver Bullet • [Broo87] Frederick P. Brooks, Jr., “No Silver Bullet: Essence and Accidents of Software Engineering”, IEEE Computer, April 1987. • See also [Broo95] Frederick P. Brooks, Jr., “The Mythical Man-Month (20th anniversary edition)”, Addison-Wesley.
➡The article is more than 15 years old. Yet, it succeeds in explaining why there will never be an easy solution for solving the problems involved in building large and complex software systems.
The Killer Robot Case • [Epst94] Richard G. Epstein, “The use of computer ethics scenarios in software engineering education: the case of the killer robot”, Software Engineering Education: Proceedings of the 7th SEI CSEE Conference.
➡The article is a fictitious series of newspaper articles concerning a robot which killed its operators due to a software fault. The series of articles conveys the different viewpoints one might have concerning the production of quality software.
Software Engineering & Society
Your personal future is at stake (e.g., Y2K lawsuits)
Huge amounts of money are at stake (e.g., Ariane V crash)
Lives are at stake (e.g., automatic pilot)
Corporate success or failure is at stake (e.g., telephone billing, VTM launching 2nd channel)
Software became ubiquitous: our society is vulnerable! ⇒ Deontology, Licensing, …
Code of Ethics
• Software Engineering Code of Ethics and Professional Practice + ACM-site: http://www.acm.org/serving/se/code.htm + IEEE-site: http://computer.org/tab/swecc/code.htm
• Recommended by + IEEE-CS (Institute of Electrical and Electronics Engineers - Computer Society) + ACM (Association for Computing Machinery)
• “Software Engineering Code of Ethics is Approved”, Don Gotterbarn, Keith Miller, Simon Rogerson, Communications of the ACM, October 1999, Vol. 42, No. 10, pages 102-107. + Announces the revised 5.2 version of the Code
• “Using the New ACM Code of Ethics in Decision Making”, Ronald E. Anderson, Deborah G. Johnson, Donald Gotterbarn, Judith Perrolle, Communications of the ACM, February 1993, Vol. 36, No. 2, pages 98-104. + Discusses 9 cases of situations you might encounter and how (an older version of) the code addresses them
Code of Ethics: 8 Principles
• ACM-site: http://www.acm.org/serving/se/code.htm • IEEE-site: http://computer.org/tab/swecc/code.htm
1. PUBLIC • Software engineers shall act consistently with the public interest.
2. CLIENT AND EMPLOYER • Software engineers shall act in a manner that is in the best interests of their client and employer, consistent with the public interest.
3. PRODUCT • Software engineers shall ensure that their products and related modifications meet the highest professional standards possible.
4. JUDGMENT • Software engineers shall maintain integrity and independence in their professional judgment.
5. MANAGEMENT • Software engineering managers and leaders shall subscribe to and promote an ethical approach to the management of software development and maintenance.
6. PROFESSION • Software engineers shall advance the integrity and reputation of the profession consistent with the public interest.
7. COLLEAGUES • Software engineers shall be fair to and supportive of their colleagues.
8. SELF • Software engineers shall participate in lifelong learning regarding the practice of their profession and shall promote an ethical approach to the practice of the profession.
Case: Privacy - Description
Case Description • You consult for a company concerning a database for personnel management. • The database will include sensitive data: performance evaluations, medical data. • The system costs too much and the company wants to cut back on security.
What does the code say? • 1.03. Approve software only if they have a well-founded belief that it is safe, meets specifications, passes appropriate tests, and does not diminish quality of life, diminish privacy or harm the environment. The ultimate effect of the work should be to the public good.
• 3.12. Work to develop software and related documents that respect the privacy of those who will be affected by that software.
➡ Situation is unacceptable.
Case study: Privacy - Solution
Applicable Clauses • 1.02. Moderate the interests of the software engineer, the employer, the client and the users with the public good. • 1.04. Disclose to appropriate persons or authorities any actual or potential danger to the user, the public, or the environment, that they reasonably believe to be associated with software or related documents. • 2.07. Identify, document, and report significant issues of social concern, of which they are aware, in software or related documents, to the employer or the client. • 6.09. Ensure that clients, employers, and supervisors know of the software engineer's commitment to this Code of Ethics, and the subsequent ramifications of such commitment.
Actions • Try to convince management to keep high security standards. • Include in the contract a clause to cancel the contract when it goes against the code of ethics. • Alert other institutions if you later hear that others accepted the contract.
Case: Unreliability
Case Description • You’re the team leader of a team building software for calculating taxes. • Your team and your boss are aware that the system contains a lot of defects. Consequently you state that the product can’t be shipped in its current form. • Your boss ships the product anyway, with a disclaimer “Company X is not responsible for errors resulting from the use of this program”.
What does the code say? • 1.03. Approve software only if they have a well-founded belief that it is safe, meets specifications, passes appropriate tests, and does not diminish quality of life, diminish privacy or harm the environment. The ultimate effect of the work should be to the public good. • 5.11. Not ask a software engineer to do anything inconsistent with this Code. • 5.12. Not punish anyone for expressing ethical concerns about a project.
➡ Disclaimer does not apply: can only be made in “good conscience”.
➡ In court you cannot be held liable.
Agile Quality Assurance
Introduction • Reliability vs. Agility
Mining Software Repositories
• Tests (= visualisation) + How good was our testing process?
• Bugs (= text mining) + Who should fix this bug? + How long will it take to fix this bug? + What is the severity of this bug?
• Expertise (= social network analysis) + Who are the key personalities? + Who can help me with this file? + Where should we focus our (regression) tests?
Conclusion • The hype-cycle
The Future — Software Engineering Tools
Innovation
Business Models
• 1475 — Kiva Han coffee house (Constantinople)
• 1529 — European coffee house (Vienna)
• 1971 — Starbucks (Seattle)
Underlying Technology
• 1908 — patent on the paper filter
• 1946 — commercial piston espresso machine
• 2000 — Nespresso
• 2001 — Senseo
Technology changes every 20 years … underlying business models rarely change!
Innovation in ICT
Embedded … Internet
Underlying Technology
• ENIAC, 1945 • IBM PC, 1981 • NEC UltraLite, 1989 • iPad, 2010
Technology changes every 5 years … underlying business models change often!
Market pressure in ICT
RELIABILITY AGILITY
Measure of innovation • # products in portfolio younger than 5 years
+ in ICT usually more than 1/2 the portfolio
Significant investment in R&D • more products … faster
Reliability vs. Agility
Software is vital to our society ⇒ Software must be reliable
Traditional Software Engineering Reliable = Software without bugs
Today’s Software Engineering Reliable = Easy to Adapt
Striving for RELIABILITY
(Optimise for perfection)
Striving for AGILITY
(Optimise for development speed)
On the Origin of Species
Software Evolution
It is not age that turns a piece of software into a legacy system, but the rate at which it has been developed and adapted without being reengineered.
[Deme02] Demeyer, Ducasse and Nierstrasz: Object-Oriented Reengineering Patterns
Components are very brittle … After a while one inevitably resorts to glue :)
Take action here … instead of waiting until here.
Software Repositories & Archives
Version Control • CVS, Subversion, … • Rational ClearCase • Perforce • Visual SourceSafe • …
Issue Tracking • Bugzilla • BugTracker.NET • ClearQuest • JIRA • Mantis • Visual Studio Team Foundation Server • …
Automate the Build • make • Ant, Maven • MSBuild • OpenMake • Build Forge • …
Automated Testing • HP QuickTest Professional • IBM Rational Functional Tester • Maveryx • Selenium • TestComplete • Visual Studio Test Professional 2010 (Microsoft) • …
… mailing archives, newsgroups, chat-boxes, Facebook, Twitter, …
All of a sudden empirical research has what any empirical science needs: a large corpus of objects to analyze.
[Bertrand Meyer's technology blog]
Mining Software Repositories
The Mining Software Repositories (MSR) field analyzes the rich data available in software repositories to uncover interesting and actionable information about software systems and projects.
Conferences • 2017—14th edition, Buenos Aires, Argentina • 2016—13th edition, Austin, Texas • 2015—12th edition, Florence, Italy • 2014—11th edition, Hyderabad, India • 2013—10th edition, San Francisco, CA, USA • 2012—9th edition, Zürich, CH • 2011—8th edition, Honolulu, HI, USA • 2010—7th edition, Cape Town, ZAF • 2009—6th edition, Vancouver, CAN • 2008—5th edition, Leipzig, DEU • 2007—4th edition, Minneapolis, MN, USA • 2006—3rd edition, Shanghai, CHN • 2005—2nd edition, Saint Louis, MO, USA • 2004—1st edition, Edinburgh, UK
Hall of Fame—Mining Challenge Winners • 2017—TravisTorrent (GitHub) • 2016—BOA (SourceForge & GitHub) • 2015—StackOverflow • 2014—Sentiment Analysis of Commit Messages in GitHub: An Empirical Study • 2013—Encouraging User Behaviour with Achievements: An Empirical Study [StackOverflow] • 2012—Do the Stars Align? Multidimensional Analysis of Android's Layered Architecture • 2011—Apples vs. Oranges? An Exploration of the Challenges of Comparing the Source Code of Two Software Systems [NetBeans+Eclipse] • 2010—Cloning and Copying between GNOME Projects • 2009—On the Use of Internet Relay Chat (IRC) Meetings by Developers of the GNOME GTK+ Project • 2008—A Newbie's Guide to Eclipse APIs • 2007—Mining Eclipse Developer Contributions via Author-Topic Models • 2006—A Study of the Contributors of PostgreSQL
Test Monitor — Change History
http://swerl.tudelft.nl/bin/view/Main/TestHistory Case = Checkstyle
Chart annotations: Single Test • Unit Testing • Integration Tests • Phased Testing • Steady Testing & Coding
Test Monitor — Growth History
Chart annotations: Few Tests • Steady Testing & Coding • Increased Test Activity
Test Monitor — Coverage Evolution
Chart annotations: Good Class Coverage • Sudden Drop!?
Description ⇒ text Mining
Stack Traces ⇒ Link to source code
Product/Component Specific vocabulary
Suggestions?
Bug Database (bug reports)
(1) & (2) Extract and preprocess bug reports
(3) Train a predictor (text mining)
(4) For a new report, predict • severity • assigned-to • estimated time
Evaluate each prediction as • true positive • false positive • true negative • false negative ⇒ precision & recall
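The pipeline can be sketched end-to-end with a toy bag-of-words predictor. The reports, labels, and scoring rule are all made up; real studies use proper classifiers and far larger corpora:

```python
from collections import Counter

# (1)-(3) Train a naive bag-of-words severity predictor on made-up reports.
train = [
    ("crash on startup null pointer", "severe"),
    ("segfault when saving file", "severe"),
    ("typo in about dialog", "minor"),
    ("misaligned button in settings", "minor"),
]
freq = {"severe": Counter(), "minor": Counter()}
for text, label in train:
    freq[label].update(text.split())

def predict(text):
    # (4) Score a new report by word overlap with each class.
    scores = {c: sum(freq[c][w] for w in text.split()) for c in freq}
    return max(scores, key=scores.get)

# Evaluate: precision & recall for the "severe" class on a tiny test set.
test = [("crash when opening file", "severe"), ("typo on button", "minor")]
tp = sum(1 for t, l in test if predict(t) == "severe" == l)
fp = sum(1 for t, l in test if predict(t) == "severe" != l)
fn = sum(1 for t, l in test if predict(t) != "severe" == l)
precision = tp / (tp + fp) if tp + fp else 0.0
recall = tp / (tp + fn) if tp + fn else 0.0
```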
Results
Who should fix this bug? • Cases: Eclipse, Firefox, gcc • Precision: Eclipse 57%, Firefox 64%, gcc 6% • Recall: —
How long will it take to fix this bug? • Case: JBoss • Depends on the component: many similar reports, off by one hour; few similar reports, off by 7 hours
What is the severity of this bug? • Cases: Mozilla, Eclipse, Gnome • Precision: Mozilla/Eclipse 67%-73%, Gnome 75%-82% • Recall: Mozilla/Eclipse 50%-75%, Gnome 68%-84%
Promising results but … • how much training is needed? • how reliable is the data? (estimates, severity, assigned-to) • does this generalize? (on an industrial scale?) ⇒ replication is needed
Space … Place
© Inspired by Margaret-Anne (Peggy) Storey; Keynote for MSR 2012, Zurich, Switzerland
270 M. Pinzger and H.C. Gall
Fig. 13.1 Communication paths extracted from the Eclipse Platform Core developer mailing list
way. Figure 13.1 shows an excerpt of a mail thread from the online mailing list archive of Eclipse Platform Core.
A mail addressed to a mailing list is processed by the reflector and sent to all subscribers. This means that the To: address is always the mailing list address itself; hence, there is no explicit receiver address. In our example this address is platform-core-dev. To model the communication path between sender and receiver, the receiver needs to be reconstructed from subsequent answering mails. The identification of the sender is given by the From: field which is denoted by the name on the right side of a message. For determining the receivers of emails we analyze the tree structure of a mail thread and compute the To: and Cc: paths.
Figure 13.1 illustrates the two paths in our example thread whereas gray arrows denote the To: path and light gray arrows the Cc: path. A gray arrow is established between an initial mail and its replies. For example, Philippe Ombredanne is first replying to the mail of Thomas Watson, so in this case Philippe Ombredanne is the sender and Thomas Watson is the receiver of the mail. To derive Cc: receivers we consider the person answering a mail as an intended receiver of this mail. In case this person is already the To: receiver (as it applies with the mails number 3–5 between Bob Foster and Pascal Rapicault) no additional path is derived, because we assume that a mail is not sent to a person twice.
For importing the data from the mailing lists archives we extended the iQuest tool. iQuest is part of TeCFlow,1 a set of tools to visualize the temporal evolution of communication patterns among groups of people. It contains a component to parse mailing lists and import them into a MySQL database. Our extension aims at including the follow-up information of mails to fully reconstruct the structure of a mail-thread. The sample thread shown above consists of 15 mails that result in 25 communication paths.
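The To:-path reconstruction described above can be sketched in a few lines: whoever answers mail m is taken as sending to the author of m. The thread structure and names echo the example, but the data layout is invented:

```python
# Reconstruct To: communication paths from a reply tree, as described
# above: the sender of a reply is connected to the author of the mail
# being answered. Thread structure and names are illustrative only.
thread = {
    "m1": {"from": "Thomas", "reply_to": None},
    "m2": {"from": "Philippe", "reply_to": "m1"},
    "m3": {"from": "Bob", "reply_to": "m2"},
}

def to_paths(thread):
    paths = []
    for mail in thread.values():
        parent = mail["reply_to"]
        if parent is not None:
            # (sender, receiver) edge of the social network
            paths.append((mail["from"], thread[parent]["from"]))
    return paths

print(to_paths(thread))   # [('Philippe', 'Thomas'), ('Bob', 'Philippe')]
```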
13.3.1.2 Deriving Communication Paths from Bug Reports
The second source outlined for modeling communication paths is a bug tracking repository, such as Bugzilla. Bugzilla users create reports and comments and give
1 http://www.ickn.org/ickndemo/
13 Dynamic Analysis of Communication and Collaboration 273
Fig. 13.2 Integrated model for representing communication and collaboration data in OSS projects
person. In every other case the person is assumed to be unknown and a new person entity is added to the database.
While this algorithm works fine for person information obtained from Bugzilla and mailing lists, there are problems with matching persons obtained from CVS log data. Typically, the author stored in CVS logs indicates the CVS user name, but not the real name of a person. Because of the high number of false matches, the mapping of these persons is done manually.
In addition to the information of a person, email addresses contain domain information that, for example, denotes the business unit of a developer. We use this information to assign developers to teams. We obtain email addresses that have been generated with MHonArc.2 The problem is that MHonArc provides a spam mode which deters spam address harvesters by hiding the domain information of email addresses. For example, the email address of Chris McGee is displayed as
2 http://www.mhonarc.org/
Construct Social Network
Mailing lists + Bug reports + CVS Logs ⇒ social network
Code Ownership (& Alien Commits)
Fig. 13.7 Collaboration in the Eclipse Platform Core project observed in the time from 14th to 28th February 2005
information ideally gets processed. Figure 13.8 illustrates the communication via the developer mailing list and Bugzilla data over 21 months. The amount of communication (i.e., the number of communication paths reconstructed from bug reports and emails) is illustrated by the width of edges. The wider the edges of a person’s node are, the more this person communicated with other developers.
The graph in Fig. 13.8 shows the core development team whose members frequently communicate with each other. Rafael Chaves, Dj Houghton, Jeff Mcaffer, Thomas Watson, John Arthorne, and Pascal Rapicault form the core team. They are the communicators who keep the network together and play an important role within the project. Interesting is that they all belong to either the group @ca.ibm.com or [email protected] as indicated by the shadows of rectangles representing these developers. Another highly connected group is formed by the Swiss team (@ch.ibm.ch) whose members are represented by the nodes on the right side of the graph. Almost each developer of the Swiss team is in touch with the US team; however, Markus Keller and Daniel Megert turn out as the main communicators between the two teams during that time.
Another interesting finding concerns the environment via which the developers communicated. Most of the communication was via Bugzilla bug reports indicated by the gray edges. Only the core team also used the mailing list to discuss Eclipse
Code Owners
Alien Commit
Key personalities
Fig. 13.8 Communicators of the Eclipse Platform Core project as from May 2004 to February 2006
Platform Core relevant issues. Such findings are of particular interest when new ways of communication are considered.
13.5.4 Project Dynamics
Newcomers should be integrated fast into development teams to rapidly increase productivity and foster synergy among team members. With STNA-Cockpit the project manager can observe how newcomers actually are integrated into their teams. For this, the project manager selects the starting observation period and uses the time-navigation facility of STNA-Cockpit to investigate the evolution of the communication and collaboration network over time. The graph animation allows the project manager to observe how the newcomer behaves concerning communication and collaboration with other team members. In particular, she looks for communication paths that tell her the newcomer gets actively involved
communicators
“swiss group”
Socialization
Fig. 13.9 Socialization of Kevin Barness in the Eclipse Platform Core project: snapshots (a) 1st half of April 2004, (b) 2nd half of April 2004, (c) 1st half of June 2004, (d) 2nd half of June 2004
in discussions on the developers mailing lists and bug reports. In addition, she observes whether the newcomer contributes to the plug-ins and finally takes over responsibility of portions of the source code.
Consider the following scenario in which Kevin Barness is entering the US team @ca.ibm.com of the Eclipse Platform Core project in April 2004. Figure 13.9 depicts various snapshots taken from the network created for subsequent points in time. Kevin Barness is starting as a developer in the Eclipse Platform Core team at the beginning of April 2004. His first action is to get in touch with some key personalities of the project, namely Rafael Chaves and John Arthorne. His first contacts are visualized by the graphs depicted by (Fig. 13.9a, b). In the following weeks he communicates also with other project members to get more involved into the project (see Fig. 13.9c), namely Darin Wright and Darin Swanson. As (Fig. 13.9d) illustrates, Darin Wright is a developer and Darin Swanson the owner of the files that are going to be modified by Kevin. Rafael Chaves seems to play the role of the connector who introduces the new developer Kevin Barness to the responsible persons. According to the graph, he is communicating with two senior developers.
Expertise Browser
Used within a geographically dispersed team • 120 developers at two sites (Germany and England), grew to 250 developers (incl. a satellite site in France) • satellite teams: locate expertise • established teams: who is doing what?
Code Ownership vs. Code Quality
Figure 2: Ownership graphs for two binaries in Windows: (a) Ownership of A.dll by Developers, (b) Ownership of B.dll by Developers
of experience used by Mockus and Weiss also used the number of changes. However, prior literature [14] has shown high correlation (above 0.9) between number of changes and number of lines contributed and we have found similar results in Windows, indicating that our results would not change significantly. With these terms defined, we now introduce our metrics.
• Minor – number of minor contributors
• Major – number of major contributors
• Total – total number of contributors
• Ownership – proportion of ownership for the contributor with the highest proportion of ownership
Figure 1 shows the proportion of commits for each of the developers that contributed to abocomp.dll in Windows, in decreasing order. This library had a total of 918 commits made during the development cycle. The top contributing engineer made 379 commits, roughly 41%. Five engineers made at least 5% of the commits (at least 46 commits). Twelve engineers made less than 5% of the commits (less than 46 commits). Finally, there were a total of seventeen engineers that made commits to the binary. Thus, our metrics for abocomp.dll are:
Metric      Value
Minor       12
Major       5
Total       17
Ownership   0.41
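Under the definitions above (a minor contributor falls below 5% of a binary's commits), the abocomp.dll metrics can be reproduced with a short sketch. The individual per-engineer commit counts are illustrative, chosen only to match the totals reported in the text (918 commits, top contributor 379):

```python
# Illustrative per-engineer commit counts for one binary.
commits = [379, 120, 90, 70, 50,                            # >= 5% each
           30, 25, 22, 20, 18, 17, 16, 15, 14, 12, 10, 10]  # < 5% each

total_commits = sum(commits)           # 918
threshold = 0.05 * total_commits       # 45.9 commits

major = sum(1 for c in commits if c >= threshold)  # engineers at >= 5%
minor = sum(1 for c in commits if c < threshold)   # engineers below 5%
total = len(commits)                               # all contributors
ownership = max(commits) / total_commits           # top contributor's share

print(minor, major, total, round(ownership, 2))  # 12 5 17 0.41
```

In practice these counts would be extracted from version-control history (e.g. per-author commit tallies) rather than written out by hand.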
4. HYPOTHESES
We begin with the observation that a developer with lower expertise is more likely to introduce bugs into the code. A developer who has made a small proportion of the commits to a binary likely has less expertise and is more likely to make an error. We expect that as the number of developers working on a component increases, the component may become “fragmented” and the difficulty of vetting and coordinating all these minor contributions becomes an obstacle to good quality. Thus if Minor is high, quality suffers.
Hypothesis 1 - Software components with many minor contributors will have more failures than software components that have fewer.
We also look at the proportion of ownership for the highest contributing developer for each component (Ownership). If Ownership is high, that indicates that there is one developer who “owns” the component and has a high level of expertise. This person can also act as a single point of contact for others who need to use the component, need changes to it, or just have questions about it. We theorize that when such a person exists, the software quality is higher, resulting in fewer failures.
Hypothesis 2 - Software components with a high level of ownership will have fewer failures than components with lower top ownership levels.
If the number of minor contributors negatively affects software quality, the next question to ask is, “Why do some binaries have so many minor contributors?” We have observed, both at Microsoft and within OSS projects such as Python and Postgres, that during the process of maintenance, feature addition, or bug fixing, owners of one component often need to modify other components that the first relies on or is relied upon by. As a simple example, a developer tasked with fixing media playback in Internet Explorer may need to make changes to the media playback interface library even though the developer is not the designated owner and has limited experience with this component. This leads to our hypothesis.
Hypothesis 3 - Minor contributors to components will be Major contributors to other components that are related through dependency relationships.
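Hypothesis 3 can be probed mechanically once commit counts and a dependency graph are available. The sketch below uses entirely hypothetical binaries, engineers, and dependency data, merely to show the shape of such a check:

```python
# Hypothetical data: binary -> {engineer: commit count}.
commits = {
    "media.dll":  {"alice": 90, "bob": 4},    # bob is a minor contributor here
    "player.exe": {"bob": 120, "carol": 50},  # ...but a major contributor here
}
# Hypothetical dependency graph: client binary -> list of dependencies.
depends_on = {"player.exe": ["media.dll"]}

def contributors(binary, minor):
    """Return the minor (True) or major (False) contributors to a binary."""
    counts = commits[binary]
    total = sum(counts.values())
    return {e for e, c in counts.items() if (c / total < 0.05) == minor}

# Count cases where a minor contributor to a dependency is a
# major contributor to a client that depends on it.
hits = 0
for binary, deps in depends_on.items():
    for dep in deps:
        hits += len(contributors(dep, minor=True) &
                    contributors(binary, minor=False))
print(hits)
```

Evidence for the hypothesis would show up as many such hits relative to the total number of minor contributions.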
Finally, if low-expertise contributions do have a large impact on software quality, then we expect that defect prediction techniques will be affected by their inclusion or removal. We therefore replicate prior defect prediction techniques and compare results when using all data, data derived only from changes by minor contributors, and data derived only from changes by major contributors. We expect that when data from minor contributors is removed, the quality of the defect prediction will suffer.
Software components with many minor contributors will have more failures than software components that have fewer.
Software components with a high level of ownership will have fewer failures than components with lower top ownership levels.
Data from Windows Vista and Windows 7
Q&A support
In CHI2016 Proceedings

Agile Quality Assurance
Introduction • Reliability vs. Agility
Mining Software Repositories
• Tests (= visualisation) + How good was our testing process?
• Bugs (= text mining) + Who should fix this bug? + How long will it take to fix this bug? + What is the severity of this bug?
• Expertise (= social network analysis) + Who are the key personalities? + Who can help me with this file? + Where should we focus our (regression) tests? + Stack Overflow
Conclusion • The hype-cycle
Table Of Contents
!50
12. Conclusion
Hype Cycle
!51
Hype Cycle © Gartner
[Figure: visibility (y-axis) vs. maturity (x-axis), with the phases Technology Trigger, Peak of Inflated Expectations, Trough of Disillusionment, Slope of Enlightenment, Plateau of Productivity]
12. Conclusion
The Future?
!52
Personal Opinion
[Figure: the same Gartner Hype Cycle, annotated with a personal positioning of software engineering tools along the curve]
IBM (Patents) ⇒ Eclipse
Microsoft Research ⇒ Team Foundation Server
12. Conclusion
Summary (i): You should know the answers to these questions
• Name 3 items from the code of ethics and provide a one-line explanation.
• If you are an independent consultant, how can you ensure that you will not have to act against the code of ethics?
• What would be a possible metric for measuring the amount of innovation of a manufacturing company?
• What can we do to prevent glue code from eroding our component architecture?
• Explain in 3 sentences how a software tool could make recommendations for certain fields in a bug report (severity, assigned-to, estimated time).
• Which components would be more susceptible to defects: the ones with a high level of ownership or the ones with a low level of ownership?
When you chose the “No Silver Bullet” paper
• What’s the distinction between essential and accidental complexity?
• Name 3 reasons why building software is essentially a hard task, and provide a one-line explanation for each.
• Why is “object-oriented programming” no silver bullet?
• Why is “program verification” no silver bullet?
• Why are “components” a potential silver bullet?
When you chose the “Killer Robot” paper
• Which regression tests would you have written to prevent the “killer robot”?
• Was code reviewing applied as part of the QA process? Why (not)?
• Why was the waterfall process disastrous in this particular case?
• Why was the user-interface design flawed?
!53 12. Conclusion
Summary (ii): Can you answer the following questions?
• You are an experienced designer and you heard that the sales people earn more money than you do. You want to ask your boss for a salary increase; how would you argue your case?
• Software products are usually released with a disclaimer like “Company X is not responsible for errors resulting from the use of this program”. Does this mean that you shouldn’t test your software? Motivate your answer.
• You are a QA manager and are requested to produce a monthly report about the quality of the test process. How would you do that?
• Give 3 scenarios to illustrate why it would be interesting to know which developers are most experienced with a given piece of code.
• It has been demonstrated that it is feasible to make reliable recommendations concerning fields in a bug report. What is still needed before such a recommendation tool can be incorporated into state-of-the-art tools (bugzilla, jira, …)?
When you chose the “No Silver Bullet” paper
• Explain why incremental development is a promising attack on conceptual essence. Give examples from the different topics addressed in the course.
• “Software components” are said to be a promising attack on conceptual essence. Which techniques in the course are applicable? Which techniques aren’t?
When you chose the “Killer Robot” paper
• Recount the story of the Killer Robot case. List the three most important causes for the failure and argue why you think these are the most important.
!54