Empirical Research on Software Analysis and Design with UML Bente Anda Research scientist, Simula Research Laboratory Associate professor II, IFI 6.11.2007


Empirical Research on Software Analysis and Design with UML

Bente Anda

Research scientist, Simula Research Laboratory

Associate professor II, IFI

6.11.2007

6/11 – 07 Bente Anda 2

Motivation

• Software analysis and design using UML is

– taught in software engineering courses,

– advocated as a means to ensure quality in software development projects, and

– frequently used in some form in software development projects.

Advocates of Methods and Tools claim benefits, examples:

“UML-based model-driven development tools like xxx speeds up development, testing, debugging and documentation of embedded designs.”

“Model-driven development methods using UML have already demonstrated their potential for radical improvements in the quality of software and the productivity of development.”

http://techrepublic.com.com/

“Developers who understand UML can communicate precisely and effectively with one another about the structures of object oriented systems, minimizing the chances of misunderstandings.  New developers to a team who understand UML can quickly grasp the structure of a software application when UML analysis and design documentation is available for it.  UML helps to standardize your software development process and make it easier for new developers trained in UML to come up to speed with your application's architecture.”

http://www.mblmsoftware.com/UMLatMBLM.aspx

Empirical Research on UML

Current situation:

• Relatively few empirical studies on UML-based development have been reported.

• Consequently, it is difficult to summarize such studies or to do a meta-analysis to provide strong empirical evidence of the effects of the technique.

• It is also difficult to identify studies that are relevant for a specific project context.

• There are, however, individual studies that may provide good advice.

Ideal situation:

• Sufficient empirical evidence to, given a specific project, be able to make decisions about how to best use UML-based development

– to satisfy project goals, for example regarding quality and time, and

– according to project constraints, for example project members’ qualifications and turnover.

Many more empirical studies are needed in this area.

Content of (rest of) Lecture

• Empirical evidence on

– how UML is used in industry (2 surveys),

– how to best use UML in software projects

• Describing use cases (experiment)

– effects of UML

• The impact of UML documentation on software maintenance (a series of experiments + some results from a case study)

• Learning goal(s)

– Understand how different aspects of the use of UML in software development can be investigated, and

– understand the kinds of evidence-based decisions that can be made about UML-based development.


Exercise

• Read through the two survey papers, and answer the following:

1. How are the studies similar and how do they differ (regarding method and results)?

2. Can you find any possible explanations for different results?

How is UML used in Industry? 1

Dobing, B. and Parsons, J. How UML is used. Communications of the ACM, 49(5):109–113, 2006.

• Research questions

– To what extent are the UML components used and for what purposes?

– Do differences in the levels of use and the reasons for these differences reflect the complexity of the language?

– How successful is UML in facilitating communication within software development teams?

• A web-based survey with questions based on literature on UML and interviews with UML users.

• The survey attracted 182 responses from analysts using UML. The respondents were members of the OMG and their contacts.

– They had been involved in an average of 27 projects (6.2 using UML), over an average 15-year career (4.7 using UML).

– The median "typical" UML project had a budget of around $1,000,000 and 6.5 person-years, and required about 50,000 lines of code.

How is UML used in Industry? 2

Grossman, M., Aronson, J.E. and McCarthy, R.V. Does UML make the grade? Insights from the software development community. Information and Software Technology, 47(6):383-397, 2005.

• Research questions

– Do individuals who use UML perceive it to be beneficial?

– Does UML provide a task-technology fit to individuals who utilize it?

– What are the characteristics that affect UML use?

• A web-based survey with 32 questions based on a framework for understanding use of technology, the Task-Technology Fit Model

• A database with 1507 e-mail addresses of UML users was established by accessing online newsgroups, user groups, conference web sites and articles.

• The survey was sent to those e-mail addresses; 150 UML users responded, and 19 responses were incomplete, leaving 131 usable responses.

• The respondents

– were located worldwide, although most were from the US.

– 57% had more than 6 years experience with OO technology, 5% had one year or less, but over 50% had completed fewer than five projects in UML

– They were managers, systems analysts and software developers

– 21.4% of projects were over $1,000,000 (the high end), and 16% were under $30,000 (the low end)
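As a rough check on the sampling figures for this survey, the response rate can be worked out from the numbers on the slide (1507 addresses, 150 responses, 19 incomplete). The sketch below is only an illustration; the variable names are not taken from the paper.

```python
# Figures copied from the slide; variable names are illustrative only.
addresses_contacted = 1507
responses_received = 150
incomplete_responses = 19

# Usable responses are those that were not incomplete.
usable_responses = responses_received - incomplete_responses

# Response rates relative to the number of addresses contacted.
raw_response_rate = responses_received / addresses_contacted
usable_response_rate = usable_responses / addresses_contacted

print(f"Usable responses: {usable_responses}")
print(f"Raw response rate: {raw_response_rate:.1%}")
print(f"Usable response rate: {usable_response_rate:.1%}")
```

Both rates come out at roughly one in ten or lower, which is worth keeping in mind when comparing this survey with the first one.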


Results 1

Results 2

Main objective for using UML:

Objective                               Frequency  Percent
Capture and communicate requirements    74         56.5
Guide development of code               39         29.8
Reverse engineering                     13         9.9
Other                                   5          3.8

Use of UML in organization:

Use                                             Frequency  Percent
No response                                     1          0.8
Other                                           14         10.7
Used consistently on all development projects   36         27.5
Used for large projects only                    22         16.8
Used sporadically                               58         44.3

Usage of diagrams: (figure not reproduced)

Conclusions

Paper 1:

• UML may be too complex for many developers and projects

– Avoid collaboration (communication) diagrams in the first UML projects

– Statechart (state machine) diagrams are very useful for their intended purpose but often not critical.

– Focusing on a smaller set of components may be a good strategy in the early stages of learning UML, and may reduce the cost of ensuring consistency across different components.

• Quite high client/user involvement in developing and reviewing UML components, but more attention may be needed on how clients/users can be involved in the use of UML beyond use case descriptions.

• Those with the most UML experience reported that their projects used more components, suggesting that usage levels might increase as practitioners gain experience.

Paper 2:

• Did not really manage to answer the research questions, maybe because the UML users do not yet have a good understanding of how this technology fits the tasks they are trying to perform.

Comparison of the papers

Method

• Development of the questionnaires

– Paper 1 based its questions on general claims about UML

– Paper 2 based its questions on a specific framework, task-technology fit.

• Respondents

– Paper 1 targeted a relatively narrow group, members of the OMG and their contacts, but we know nothing about the response rate.

– Paper 2 reached out broadly to UML newsgroups, UML-related conferences, etc., but received relatively few responses.

Results

– Paper 1 answers its research questions and gives fairly precise answers

• easier to get answers when the questions are based on general claims…

• perhaps a narrower and more skilled target group, but we only have information about average experience with UML, and we also know nothing about the response rate…

– Paper 2 does not seem to draw conclusions on its research questions, nor does it address which tasks UML is supposed to be suited for, i.e. which task the technology should fit…

– Use of UML is quite similar for the most used UML components such as use cases, class diagrams and sequence diagrams, while activity diagrams, statechart (state machine) diagrams and collaboration (communication) diagrams are used less often in Paper 2.

– Is this related to the knowledgeable users in Paper 1? That is also Paper 1's own claim regarding use…

– On the other hand, Paper 1 found that UML was too complex, whereas Paper 2 did not really find that.

How to best Describe Use Cases*

Motivation

– Use cases are used for capturing and describing functional requirements.

– Use cases are used in the development process, for example in planning, design and test, and as a means of communication among stakeholders in development projects.

– An example of a guideline on use case modelling in a major Scandinavian telecommunication company:

“The use case specification document must be interpreted in the same way by whoever reads it”

It is important that use cases are described in such a way that they support the development process and promote a good understanding of the requirements among stakeholders.

But, how can guidelines for description be used to achieve such quality?

*Anda, B., Sjøberg, D. and Jørgensen, M. Quality and Understandability of Use Case Models, ECOOP 2001, 15th European Conference on Object-Oriented Programming, pp. 402-428, 2001.

Use Case Modelling

• A use case model describes a system's intended functions and its environment. It has two parts:

• A diagram that provides an overview of actors and use cases, and their interactions.

– An actor represents a role that the user can play with regard to the system.

– A use case represents an interaction between an actor and the system.

• The use case descriptions detail the requirements by documenting the flow of events between the actors and the system.


Many Guidelines – Which ones to Choose?

• There are many different, sometimes contradictory, recommendations and guidelines on use case modelling.

– To our knowledge only one set of guidelines had previously been empirically evaluated [1, 2].

• Typical alternatives

1. Minor guidelines – Simple guidelines that only give support on how to identify actors and use cases

2. Template guidelines [3] – Guidelines on the content of the description of each use case

3. Style guidelines [1] – Guidelines on how to document the flow of events in each use case

[1] Ben Achour, C., Rolland, C., Maiden, N. and Souveyet, C. Guiding use case authoring: results from an empirical study, 4th IEEE International Symposium on Requirements Engineering, Limerick, 7-11 June, 1999.

[2] Cox, K. and Phalp, K. Replicating the CREWS use case authoring guidelines experiment, Empirical Software Engineering Journal, 5(3):245-268, 2000.

[3] Cockburn, A. Writing Effective Use Cases. Addison-Wesley, 2000.

The Experiment

• We conducted a large experiment on the effects of the three different sets of guidelines on use case modelling as part of a project in a course in software engineering.

– Use case modelling was taught in lectures in the course.

• The 139 students in the course were divided into 31 project groups. One of the tasks in the project was to make use case models for a system to be developed.

– The three different sets of guidelines were taught to different project groups in seminars.

– The project groups were organized in pairs; each project group was customer for one system while they were development team for another.

• System A – A system for publishing questionnaires on the internet for an opinion poll company.

• System B – A system for swapping duties between nurses for a hospital.

(Figure not reproduced; labels: Actor, UseCase, Development team, Customer team)


Template Guidelines

• A template for documenting actors

– Name

– Description

– Examples

• A template for documenting use cases

– Name

– Actors

– Trigger

– Prerequisites

– Post-conditions

– Normal flow of events

– Variations

– Associations
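The two templates can be read as simple record types. The sketch below is one possible rendering in Python, assuming that every field except Variations and Associations is mandatory (the slide does not say which fields are required). The class names, the `missing_fields` helper and the example use case are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ActorDescription:
    """Actor template: name, description and examples."""
    name: str
    description: str
    examples: List[str] = field(default_factory=list)


@dataclass
class UseCaseDescription:
    """Use case template with the eight fields listed on the slide."""
    name: str
    actors: List[str]
    trigger: str
    prerequisites: List[str]
    post_conditions: List[str]
    normal_flow: List[str]
    variations: List[str] = field(default_factory=list)
    associations: List[str] = field(default_factory=list)

    def missing_fields(self) -> List[str]:
        """Return the names of fields assumed mandatory that were left empty."""
        required = ["name", "actors", "trigger", "prerequisites",
                    "post_conditions", "normal_flow"]
        return [name for name in required if not getattr(self, name)]


# Hypothetical use case, loosely inspired by System B (duty swapping).
swap = UseCaseDescription(
    name="Request duty swap",
    actors=["Nurse"],
    trigger="A nurse wants to swap a duty",
    prerequisites=["The nurse is logged on"],
    post_conditions=[],
    normal_flow=["1. The nurse selects a duty",
                 "2. The system lists possible swaps"],
)
print(swap.missing_fields())
```

A check like `missing_fields` corresponds to the "Content" quality property used later in the evaluation: whether each description contains the information the template asks for.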

Style Guidelines

SG1: Write the UC normal course as a list of discrete actions in the form: <action#><action description>.

SG2: Use the sequential ordering of action descriptions to indicate strict sequence between actions. Variations should be written in a separate section.

SG3: Iterations and concurrent actions can be expressed in the same section of the UC, whereas alternative actions should be written in a different section.

SG4: Be consistent in your use of terminology.

SG5: Use present tense and active voice when describing actions.

SG6: Avoid use of negations, adverbs and modal verbs in the description of an action.

Content

CG1: <agent><action><agent>

CG2: <agent><action><object><prepositional phrase>

CG3: ’If’ <alternative assumption> ’then’ <list of action descriptions>

CG4: ’Repeat until’ <repetition condition><list of action descriptions>

CG5: <action 1> ‘while’ <action 2>
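Some of the style guidelines are mechanical enough to be checked automatically. The sketch below is a rough heuristic, not part of the original guidelines: it checks SG1 (numbered action lines) and the negation and modal-verb parts of SG6. The word lists, the assumed "number, optional period, text" format and the function name `check_action` are assumptions, and adverbs are not checked.

```python
import re
from typing import List

# Illustrative word lists; the guidelines themselves do not enumerate them.
MODAL_VERBS = {"can", "could", "may", "might", "must",
               "shall", "should", "will", "would"}
NEGATIONS = {"not", "no", "never", "cannot"}

# Assumed concrete form of "<action#><action description>": "3. text" or "3 text".
ACTION_PATTERN = re.compile(r"^(\d+)\.?\s+(.+)$")


def check_action(line: str) -> List[str]:
    """Return a list of guideline warnings for one action line."""
    match = ACTION_PATTERN.match(line.strip())
    if not match:
        return ["SG1: expected '<action#> <action description>'"]
    warnings = []
    words = re.findall(r"[a-z']+", match.group(2).lower())
    if MODAL_VERBS.intersection(words):
        warnings.append("SG6: modal verb")
    if NEGATIONS.intersection(words) or any(w.endswith("n't") for w in words):
        warnings.append("SG6: negation")
    return warnings


print(check_action("1. The nurse selects a duty"))
print(check_action("2. The system should not accept the swap"))
print(check_action("The nurse logs on"))
```

Word-list matching will produce false positives (for example "can" used as a noun), so such a check is better treated as a reviewing aid than as a verdict.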

The Guidelines were Evaluated According to

• How they affected the understanding of the requirements both among the developers and the customers (readers) by comparing the scores on correct answers to a set of questions about functionality.

Examples of Questions for System B:

– Who has access to the system and how do they log on?

– How is the roster made and updated, and who is responsible for it?

– What possibilities are there in the system to look at rosters and who has access to different rosters?

• How they contributed to other quality properties of the use case models.

– Actors – the correct actors were identified.

– Use cases – the correct use cases were identified.

– Content – the description of each use case contained the information required by all the sets of guidelines.

– Level of detail – the descriptions of each event were at an appropriate level of detail.

– Realism – the flow of events was realistic, that is, the events follow a logical and complete sequence, and it is clearly stated where variations can occur.

– Consistency – the use of terminology was consistent.

• How useful they were considered by the participants in the experiment.

Assessment of Understandability

Type of guideline     Reading   Scores on reading           Constructing   Scores on constructing
                                Min  Median  Max  Std                      Min  Median  Max  Std
Minor guidelines      14        2    6       11   2.6       13             5    8       12   2.4
Template guidelines   26        5    9       12   2.1       25             4    9       12   2.5
Style guidelines      27        1    8.5     13   2.9       28             2    9       13   2.7
Total                 68                                    66

• There was a statistically significant difference in the score on correct answers between the customers who had read use case models constructed using either the Template or Style guidelines compared with those who had used the Minor guidelines.

• There were no statistically significant differences between the guidelines when we compared the scores of the developers on the questions about functionality in the use case models they had constructed themselves.

Assessment of Quality

Type of guideline     Actors       Use cases    Content      Level of det.  Realism      Consistency  Sum
Minor guidelines      Mid (2.3)    Mid (2.1)    Worst (1.1)  Worst (1.7)    Worst (1.7)  Mid (1.7)    Worst (11.3)
Template guidelines   Best (2.6)   Best (2.5)   Best (2.5)   Best (2.2)     Best (2.4)   Mid (1.7)    Best (14.6)
Style guidelines      Worst (2.2)  Worst (1.8)  Mid (1.7)    Mid (1.9)      Mid (2.0)    Best (1.8)   Mid (12.0)

• The use case models constructed using the Template guidelines obtained the highest score on the different properties of quality.

• The Minor guidelines did worst.

• In addition, the Template guidelines were found most useful (based on questions about usefulness).

Results on Guidelines

• Indication that guidelines based on templates result in use case models that are easier to understand for the readers than are the other guidelines.

• Indication that the guidelines based on templates result in better use case models regarding also other quality attributes.

• The Style guidelines appear to improve some of the quality attributes. It may therefore be beneficial to combine template guidelines with style guidelines.

But the effects of using use case guidelines depend on:

1. The domain knowledge of the project participants,

2. their experience with use case modelling,

3. their abilities to write unambiguous texts, and

4. the turn-over of the project members.


The Impact of UML Documentation on Software Maintenance*

Motivation

• Model-driven development is often perceived as expensive

• Software maintenance is costly and is often performed by individuals who were not involved in the original design of the system

• Good documentation is therefore important, and

• modeling is seen as one way to handle the complexity of software

Can the use of UML documentation make a practically significant difference to software maintenance, one that would justify the costs?

* Arisholm, E., Briand, L.C., Hove, S.E. and Labiche, Y. The Impact of UML Documentation on Software Maintenance: An Experimental Evaluation, IEEE Transactions on Software Engineering, 32(6): 365–381, 2006.


How can UML documentation make a difference?

• By reducing the costs related to code changes, which are

1. Time to complete change tasks

2. Functional correctness of changes

3. The quality of the change’s design

• Compared to a baseline situation where the developers have

1. source code with comments to define the most complex methods and variables, and

2. a high-level textual description of the system objectives and functionality.

Two Experiments

• In Oslo

– Subjects: 22 third-year students who were paid for their participation

• The students were divided into two groups of 11 students based on credits in computer science (less and more than 30 credits)

• The students of each group were assigned randomly to use UML or to not use UML

– Duration: 8 hours on one day

– Tool: Tau - UML

• In Ottawa

– Subjects: 76 fourth-year students who did this as a compulsory part of a course (therefore they should all have the same learning experience).

• The students were divided into two groups of 38 students based on grades in a previous course on UML (above or below B-)

• The students in both groups did some tasks with UML and some without.

– Duration: 5 laboratory sessions of 3 hours, the first one used for preparation

– Tool: Visio

• Two systems were used

– ATM machine

– Vending machine for drinks

• The UML documentation consisted of a use case diagram, sequence diagrams for each use case and a class diagram.
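The Oslo design, where students are first grouped by credits and then randomly assigned to a treatment within each group, is a form of blocked random assignment. The sketch below illustrates the idea under assumptions: the subject identifiers are invented, and the rule of alternating treatments over a shuffled list is one simple way to balance group sizes, not necessarily the procedure used in the study.

```python
import random
from typing import Dict, List


def assign_within_blocks(blocks: Dict[str, List[str]], seed: int = 0) -> Dict[str, str]:
    """Randomly assign each subject to 'UML' or 'No UML' within its block."""
    rng = random.Random(seed)
    assignment = {}
    for subjects in blocks.values():
        shuffled = subjects[:]
        rng.shuffle(shuffled)
        # Alternate treatments over the shuffled order so that the two
        # treatment groups within a block differ in size by at most one.
        for index, subject in enumerate(shuffled):
            assignment[subject] = "UML" if index % 2 == 0 else "No UML"
    return assignment


# Hypothetical subject identifiers: two blocks of 11, as in the Oslo experiment.
blocks = {
    "less than 30 credits": [f"L{i}" for i in range(1, 12)],
    "more than 30 credits": [f"M{i}" for i in range(1, 12)],
}
assignment = assign_within_blocks(blocks)
uml_count = sum(1 for treatment in assignment.values() if treatment == "UML")
print(uml_count, len(assignment) - uml_count)  # prints "12 10"
```

Blocking on prior credits means that differences in background are spread evenly over the UML and no-UML conditions instead of being left to chance.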

Conduct of the Experiment

• For each subject and each task, the time to perform the task excluding (T) and including (T’) diagram modifications was recorded.

• The resulting solutions were assessed according to

– Functional correctness (C), in the Oslo experiment graded on a six point scale to indicate the amount of work required to fix deviations from the prescribed functionality, and in the Ottawa experiment measured as the number of passed test cases.

– Design quality (Q), quantifying to what extent a solution complies with the expected changes in the design.

• In the Oslo experiment, post-experiment interviews and feedback during the experiment were used to assess

1. how UML was used, and

2. the subjects’ perceptions of the costs and benefits of using it, as well as

3. how the subjects worked, and

4. what types of problems the subjects experienced on the different tasks.

Results

Oslo

Measure  Group    Task 1  Task 3  Task 4  Task 5
T        No UML   75      20      22      54
T        UML      53      15      19      65
T’       UML      70      27      36      101
C        No UML   46%     91%     91%     46%
C        UML      56%     89%     100%    89%

Ottawa

Measure  Group    Task 1  Task 2  Task 5  Task 6
T        No UML   66.5    75      89.5    166.5
T        UML      82      87      99      141
T’       UML      128     149.5   141     168
C’       No UML   8/8     5/8     5/5     0/12
C’       UML      8/8     5/8     5/5     5/12
Q        No UML   4/9     8/8     4/4     1/4
Q        UML      4/9     8/8     4/4     3/4

• The use of UML did not lead to improvements of time spent, except for the most complex task, task 6.

• Functional correctness was improved, especially for the most complex task

• Design quality was improved only for the most complex task
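The first bullet can be checked against the times in the table with Tasks 1, 2, 5 and 6, which is assumed here to be the Ottawa experiment. The sketch below computes the relative change in T (time excluding diagram modifications) when UML was used; the variable names are illustrative, and the slide does not state the time unit.

```python
# Times T (excluding diagram modifications), copied from the table for
# Tasks 1, 2, 5 and 6; assumed here to be the Ottawa experiment.
time_no_uml = {"Task 1": 66.5, "Task 2": 75.0, "Task 5": 89.5, "Task 6": 166.5}
time_uml = {"Task 1": 82.0, "Task 2": 87.0, "Task 5": 99.0, "Task 6": 141.0}

# Relative change in time when UML is used; negative means UML was faster.
relative_change = {
    task: (time_uml[task] - time_no_uml[task]) / time_no_uml[task]
    for task in time_no_uml
}

for task, change in relative_change.items():
    print(f"{task}: {change:+.1%}")

faster_with_uml = [task for task, change in relative_change.items() if change < 0]
print("Faster with UML:", faster_with_uml)
```

Only Task 6 comes out negative (about 15% less time with UML), which matches the statement that time improved only for the most complex task. Once diagram updates are included (T’), even that advantage disappears in the table (168 against 166.5).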


BestWeb – A follow-up study with Professional Software Developers

• 20 senior consultants implemented the same five well-specified maintenance tasks on a medium-sized, real and non-trivial web-based system.

– They were divided into two (equal-sized) groups: one group worked within a UML environment and one worked in a traditional (non-UML) environment.

• Each consultant spent between 1 and 2 weeks implementing the tasks.

• The results confirm the results of the experiment:

– The subjects in the UML group had on average a 54% increase in the functional correctness of changes,

– a 7% overall improvement in design quality, though a much larger improvement was observed on the first change task (56%),

– at the expense of a 14% increase in development time caused by the overhead of updating the UML documentation.

The ABB Case Study

• Described in the lecture 25/9.

• Interviews and questionnaires were used to identify costs and benefits of introducing UML in a large safety-critical project.

• Benefits were identified regarding design and documentation (the developers didn’t have experience with maintaining software with the use of UML).

• Design was improved because:

– of a greater focus on design than had been the case previously,

– more people realized the importance of designing before coding, and

– it was beneficial to have a design framework available before coding starts. In particular, the interviewees considered that the use of sequence diagrams forced them to design thoroughly.

– Some found that there was not sufficient support in the method for combining top-down and bottom-up development, something which was necessary when many building blocks were already available in the form of hardware components or legacy code.

• Documentation was improved because:

– the documents had a more unified structure with respect to content, and

– more software developers could learn UML than learn to express themselves well in English. In addition, several of the interviewees emphasized that the developers found it more fun to make diagrams than to write textual documentation; hence, they produced a more comprehensive set of analysis and design documents.

– The UML documents were, however, also often very large and therefore difficult to use.