9
V Workshop de Manutenção de Software Moderna (WMSWM) 1 Assessing Program Comprehension Tools with the Communicability Evaluation Method Denis Pinheiro 1 , Marcelo Maia 2 , Raquel Prates 1 , Roberto Bigonha 1 1 Departamento de Ciência da Computação – Universidade Federal de Minas Gerais 2 Faculdade de Computação – Universidade Federal de Uberlândia {denis,raquel,bigonha}@dcc.ufmg.br, [email protected] Abstract. Most program comprehension tools present information extracted from the source code in a visual way. The user interface of such a tool may support or hinder the strategy used by the programmer. In this paper a communicability evaluation method is conducted in two different comprehension tools and provide indicators on aspects that are relevant for the quality of the interface of such systems. 1. Introduction Understanding an already built software system is necessary because those systems are in permanent maintenance and evolution. Mayrhauser and Vans [1995] point out five basic tasks related to software maintenance and evolution: system adaptation, perfective and corrective maintenance, source code reuse and restructuring. The program comprehension activity is necessary nearly in all of the above tasks. Furthermore, large systems and outdated documentation usually add to the complexity of program comprehension task. Storey [2005] presents a study about theories and tools around the program comprehension activities. Program comprehension tools are designed according to several possible strategies used by a programmer while performing comprehension tasks. It is known that comprehension tools can better support or even hinder the comprehension strategies used by developers [Storey et al. 2000]. The functionalities supported by a tool, e.g., navigation, queries and multiple views are commonly implemented within a graphical user interface. It is well known in the human-computer interaction community that the success or failure of an interactive system is determined not only by the amount of delivered functionality, but also by how the designer chooses to present it to the user at the interface. This paper presents indicators on how the comprehension strategy chosen by the tool designer and its presentation to users affect users´ experience with two compre- hension tools that support different strategies, namely SHriMP [Storey et al. 2002] and Understand for Java [scitools.com]. Both are visual tools with graphical user interfaces showing diagrams and views built from Java source code analysis. However, SHriMP adopts an integrated comprehension strategy, whereas Understand for Java adopts a bottom-up one. We have performed a communicability evaluation with users interacting with both tools, and collected indicators on the tasks better supported by each tool, as well as on how the way these strategies were presented to the users impacted their experiences. The next secition presents some background on program comprehension strategies and reviews the comprehension tools used in the experiments. Section 3

40351€¦ · Title: Microsoft Word - 40351 Author: sony Created Date: 5/21/2008 3:12:29 AM

  • Upload
    others

  • View
    3

  • Download
    0

Embed Size (px)

Citation preview

Page 1: 40351€¦ · Title: Microsoft Word - 40351 Author: sony Created Date: 5/21/2008 3:12:29 AM

V Workshop de Manutenção de Software Moderna (WMSWM)

1

Assessing Program Comprehension Tools with the

Communicability Evaluation Method

Denis Pinheiro1, Marcelo Maia

2, Raquel Prates

1, Roberto Bigonha

1

1Departamento de Ciência da Computação – Universidade Federal de Minas Gerais

2 Faculdade de Computação – Universidade Federal de Uberlândia

{denis,raquel,bigonha}@dcc.ufmg.br, [email protected]

Abstract. Most program comprehension tools present information extracted from the source code in a visual way. The user interface of such a tool may support or hinder

the strategy used by the programmer. In this paper a communicability evaluation

method is conducted in two different comprehension tools and provide indicators on

aspects that are relevant for the quality of the interface of such systems.

1. Introduction

Understanding an already built software system is necessary because those systems are

in permanent maintenance and evolution. Mayrhauser and Vans [1995] point out five

basic tasks related to software maintenance and evolution: system adaptation, perfective

and corrective maintenance, source code reuse and restructuring. The program

comprehension activity is necessary nearly in all of the above tasks. Furthermore, large

systems and outdated documentation usually add to the complexity of program

comprehension task.

Storey [2005] presents a study about theories and tools around the program

comprehension activities. Program comprehension tools are designed according to

several possible strategies used by a programmer while performing comprehension

tasks. It is known that comprehension tools can better support or even hinder the

comprehension strategies used by developers [Storey et al. 2000]. The functionalities

supported by a tool, e.g., navigation, queries and multiple views are commonly

implemented within a graphical user interface. It is well known in the human-computer

interaction community that the success or failure of an interactive system is determined

not only by the amount of delivered functionality, but also by how the designer chooses

to present it to the user at the interface.

This paper presents indicators on how the comprehension strategy chosen by the

tool designer and its presentation to users affect users´ experience with two compre-

hension tools that support different strategies, namely SHriMP [Storey et al. 2002] and

Understand for Java [scitools.com]. Both are visual tools with graphical user interfaces

showing diagrams and views built from Java source code analysis. However, SHriMP

adopts an integrated comprehension strategy, whereas Understand for Java adopts a

bottom-up one. We have performed a communicability evaluation with users interacting

with both tools, and collected indicators on the tasks better supported by each tool, as

well as on how the way these strategies were presented to the users impacted their

experiences.

The next secition presents some background on program comprehension

strategies and reviews the comprehension tools used in the experiments. Section 3

Page 2: 40351€¦ · Title: Microsoft Word - 40351 Author: sony Created Date: 5/21/2008 3:12:29 AM

V Workshop de Manutenção de Software Moderna (WMSWM)

2

presents the Communicability Evaluation Method used to appreciate the quality of the

selected tools’ interfaces. Section 4 describes the experiment design and the results are

discussed in section 5. Finally section 6 presents our final remarks.

2. Program Comprehension Strategies and Studied Tools

Some strategies guide the way programmers work in a comprehension process [Storey

et al. 2000]. Top-down comprehension is a process to understand by reconstructing the

knowledge domain and mapping it onto the source code [Brooks 1983]. Bottom-up

comprehension is an understanding process that starts from the source code, and

successively, the software artifacts are grouped into higher level abstractions

[Shneiderman and Mayer 1979]. Systematic comprehension is a process to read the code

following the control and data flow [Soloway and Ehrlich 1986]. Knowledge-based

comprehension is a process based on the idea that the programmer can evolve the

comprehension either with bottom-up or top-down strategies [Letovsky 1986].

Integrated comprehension is a process that combines top-down, bottom-up and

knowledge based approaches into a single strategy [von Mayrhauser and Vans 1995].

The two systems used were SHriMP/Creole and Understand for Java. SHriMP

(Simple Hierarchical Multi-Perspective) [Storey et al. 2002] adopts an integrated strat-

egy, providing mechanisms to browse and to explore complex information spaces. It

provides visualization of nested graphs to present hierarchically structured information.

It introduces the concept of nested interchangeable views that allows users to explore

information in multiple perspectives and multiple abstraction levels. SHriMP may be

used to explore and to browse in the Eclipse IDE using the Creole plug-in.

SHriMP has several functionalities organized in windows encapsulated within

tabbed panes. The main window shows the system architecture built with colored nodes

and arcs. Nodes represent the program’s entities and can be expanded/collapsed in order

to show/hide the nested entities. The arcs represent the relationship between the entities,

and they can encapsulate several relationship occurrences of the same type between two

entities. Some visualization options are method call diagrams, dependency diagrams

and class hierarchy diagrams. Some algorithms are available to organize the diagrams.

A search tool is available to perform queries on entities in the program.

Understand for Java© (UJ) [scitools.com] is a source code analyzer that

supports programmers in understanding software projects written in Java. The source

code is analyzed and a repository with structures and relationships extracted from the

code is created. The repository is used to produce control flow diagrams, method call

graphs, inheritance hierarchy, and used to provide navigation and queries on source

code. The strategy adopted is bottom-up.

The tool interface is divided in three main areas: filter, information, and

document area. Filters enable the user to select specific entities in the source code based

on their types. The presentation of the entities’ information includes location, metrics

and relationships in the information browser. In the document area, several windows are

presented, including either a specific diagram or the source code. Diagrams and entities’

information are presented after selecting the entity name, or directly in the diagrams, or

activating a right click menu. Control flow diagrams, method call diagrams, inheritance

hierarchy diagrams, relationship between entities are presented using its own notation.

Page 3: 40351€¦ · Title: Microsoft Word - 40351 Author: sony Created Date: 5/21/2008 3:12:29 AM

V Workshop de Manutenção de Software Moderna (WMSWM)

3

3. The Communicability Evaluation Method

The Communicability Evaluation Method (CEM) is a qualitative method for user

interface evaluation, based on Semiotic Engineering (SemEng) theory [Prates et al.

2000, de Souza 2005, Prates and Barbosa 2007]. The SemEng perceives the system´s

interface as a designer-to-user communication. In this message the designer conveys to

users who the system is meant to, what problems it can solve and how to solve them [de

Souza 2005]. Communicability is a distinctive quality of interactive systems that

communicate efficiently and effectively to users their underlying design intent and

interactive principles.

The CEM involves users, who perform specific predetermined tasks in a

controlled environment, such as a user testing lab. Users receive scenarios describing

the tasks to be executed with the system being evaluated. The tasks are executed one at

a time and recorded using a screen capture software for further analysis. During the test,

evaluators instruct users, observe and take notes on users behavior or comments that

may strike them as relevant, but they do not interfere with task execution. At the end, an

interview with the user may be carried out in order to better understand some of the

actions and communicative breakdowns that may have occurred, and also to collect data

on user experience and satisfaction. The data analysis is comprised of 3 steps:

Tagging. Evaluator plays back the user interaction and identifies communicative

breakdowns, represented by specific interaction patterns. The evaluator then associates

an utterance to this breakdown, as if "putting words in the user´s mouth". For instance,

if the user stops the cursor over an interface element trying to see the hint associated to

it the evaluator would tag it with the utterance What’s this? or if the user was browsing

the menus looking for a specific option it would be tagged with a Where is it?. The

other utterances the comprise the predefined set are: What now?, Oops!, Where am I?,

I can’t do it this way., Why doesn’t it?, What happened?, Looks fine to me., I give up!,

I can do otherwise., Thanks, but no, thanks., Help!.

Interpretation. In this step, the evaluator tabulates the problems identified by the

breakdowns. During the interpretation process the evaluator should consider 4 factors:

(1) classification of the utterances according to the type of communicative failure they

represent in the system-user communication; (2) the frequency and context of

occurrence of each type of utterance; (3) the existence of patterned sequences of

utterance types; and (4) the level of goal related problems signaled by the occurrence of

utterance types. This step depends on the evaluator’s experience with the method and

knowledge of SemEng.

Semiotic profiling. The evaluator reconstructs the overall designer to user message.

Our analysis focuses on the tagging step and their impact to the users experience. The

complete interpretation and semiotic profiling are not discussed.

4. Experiment

The experiment was conducted with four users, each of them using both tools. The

goals of the experiment were: (1) collect information on the impact of the

comprehension strategy used by the system and that requested by the task; (2) identify

communicability problems in the selected tools that could hinder program

comprehension; and (3) based on (1) and (2) identify some relevant aspects of the

Page 4: 40351€¦ · Title: Microsoft Word - 40351 Author: sony Created Date: 5/21/2008 3:12:29 AM

V Workshop de Manutenção de Software Moderna (WMSWM)

4

interface designers should consider carefully during the project of a program

comprehension system.

Participants were chosen through a pre-test applied to graduate students from the

Computer Science Graduate Program at UFMG, who volunteered to participate in the

study. Two of the four selected participants had intermediate knowledge of Java and the

other two had advanced knowledge of Java. They all had familiarity with the Eclipse

development environment and none of them had interacted with the selected

comprehension tools before.

4.1. Experiment Organization

One of the factors that affected the choice of the comprehension tools was the different

comprehension strategies they adopted. SHriMP adopts an integrated comprehension

strategy, and the way the visualization is organized emphasizes top-down

comprehension. UJ adopts a bottom-up comprehension strategy. A communicability test

was conducted with the 4 participants performing 6 typical program comprehension

tasks. The tasks were divided into two groups: 3 tasks that required analysis of high-

level abstractions (e.g. models and diagrams) and 3 focused on low-level abstractions

(e.g. source code, flow control, method calls). Each participant used both tools,

performing 3 tasks with one tool and other 3 tasks with the other tool. The table below

shows how tasks (T), participants (P) and tools were divided. Tasks 1 to 6 were

performed in that order.

Higher-level tasks Lower-level tasks

T1 T2 T3 T4 T5 T6

P1, P2 SHriMP UJ

P3, P4 UJ SHriMP

4.2. Participant Instructions and Training

On arrival participants heard an explanation about the evaluation and its goal. They then

were given an overall explanation about the functionalities of the selected tools. They

were informed that the interaction needed to be recorded and gave their consent. They

were also informed that their identities would remain anonymous; they could interrupt

their participation at any time if they wished to and that they would have access to the

data collected during their tests at any time. The basic functionalities of the selected

tools were quickly demonstrated to the participants in order to provide an initial

knowledge about the tools to them. The demonstration took place through the

visualization of a small Java program implementing a banking application.

4.3. Execution and Pos-Test Interview

The tasks executed by the participants during the test were typical tasks of program

comprehension. A typical scenario of software maintenance requiring a comprehension

phase was presented to all participants. The participants performed the comprehension

tasks with a Battleship game program, implemented in Java. There are 15 classes

implemented in 625 lines of code, organized in 3 packages (grid, uinterface and util).

None of the participants had any knowledge about the implementation. Task execution

was recorded and an evaluator was present and took notes. Each participant’s session

lasted approximately an hour and a half.

Page 5: 40351€¦ · Title: Microsoft Word - 40351 Author: sony Created Date: 5/21/2008 3:12:29 AM

V Workshop de Manutenção de Software Moderna (WMSWM)

5

The 6 tasks were chosen based on the benchmark proposed in [Sim 2003] and

were: (1) Navigate the views created from source code by the tool and try to get an

overall architecture of the program. Draw a diagram for the architecture; (2) In which

blocks of the previous diagram were the game rules implemented?; (3) In the Battleship

game, is the mode “one user against the computer” available?; (4) What is the size of

the grid that defines the “Sea” of the game? Is the size of the grid fixed or can it be

redefined? If the size is fixed, how can the implementation be changed to support the

redefinition before the game starts, and if the grid is not fixed?; (5) How many ships can

be located in the grid (Sea)? If we want to add another kind of ship, which changes need

to be made to the program and where?; (6) What is your evaluation about the program

structure? Do you consider the Battleship game program was well-designed?. The first

three tasks required the participants to get an overall view of the program, but the third

task, also required source code inspection. The last three tasks required a lower-level

view of the program, that is, required visualizing the source code. There were tasks that

required understanding how to change the program.

At the end of the experiment the evaluator interviewed each participant. The

interview intended to allow evaluators to better understand some of the actions

observed; collect data on user experience.

5. Results

The first step of the analysis intended to identify strategies adopted by users at each

task, next the CEM tagging was performed. All participants completed all tasks but with

some variation on the detail level of the answers. Advanced users were more objective

in their answers, as opposed to intermediate users who tried to better explain their

answers. Each participant’s comprehension strategy was identified observing the

records. The participants used a bottom-up strategy with UJ because it was the only one

supported. They browsed the code and selected diagrams related to the entities in the

visualized code. Although SHriMP supports an integrated comprehension strategy,

participants adopted a top-down strategy. They kept navigating throughout the diagrams

to find and verify relationships. Even when they needed to visualize the code, they

navigated throughout the diagrams until they could locate points they could explore in

the source code, and only then did they visualize the code. It is possible that this choice

of strategy was motivated by the initial diagram presenting the overall architecture of

the program, from which the participants started the exploration of other diagrams.

During the tagging step users interactions were associated to utterances. Figure 1

shows the tags distributed along the timeline for each participant performing each task.

Notice that three tags occurred much more often than others, mainly Where is it?,

What´s this? and Oops!. The Where is it? tag could be observed frequently at the

beginning of a task execution. This breakdown occurred when users navigated

throughout the several views provided by the system looking for the necessary

information about the Battleship program to perform the task. It is interesting to notice

that although users were not looking for a function in the interface (usual interaction

pattern associated with Where is it? tag) they were trying to identify which view would

provide the desired information about the code. The occurrence of Where is it?

breakdowns are expected when first interacting with a system. However, the fact that its

frequency did not decline in time, as users learned the comprehension tool being used,

Page 6: 40351€¦ · Title: Microsoft Word - 40351 Author: sony Created Date: 5/21/2008 3:12:29 AM

V Workshop de Manutenção de Software Moderna (WMSWM)

6

as well as the Battleship program, may indicate a lack of clarity in the interface when

expressing the information extracted from the source code to the user.

Figure 1. Tagging results

The What´s this? tags occurred, mainly when using SHriMP. However, there

were 2 different situations that led to that tag. The first was the original one, that is,

when the user stops the cursor over an interface element and waits for the explanation

available at the hint. The other one represented the same interaction pattern, but with

different purposes. In SHriMP the abstract views would present detailed information by

use of the hint. For instance, in the view that showed the dependency between entities,

once the cursor was put on the arc that represented the dependency, dependency details

about which specific method depended on which other one was shown. In this case

users understood how to interact with the system and what the underlying intent was

and were able to use the system successfully. Thus, the request for detailed information

should not be considered a communicative breakdown.

Another frequent communication breakdown was associated to the Oops! tag. It

was mainly identified when users opened up a view, either a diagram or the source code,

and rapidly noticed that there was no useful information related to the task at hand on

that view, returning then to the previous view. This breakdown was more frequent when

using UJ. One participant commented that the Information Browser of UJ, that presents

textual information (e.g. metrics, declarations, uses, dependencies, etc.) about the

entities was the only really useful functionality in that tool.

There was an instance of communicative breakdown that is interesting to

discuss, even though it only happened in one of the tests. In SHriMP, the visualization

of the source code is limited to a small window. Most users complained about this

feature. The utterance Thanks, but no, thanks! occurred when participant 3 was

executing task 5. This participant needed to navigate throughout the source code to

Page 7: 40351€¦ · Title: Microsoft Word - 40351 Author: sony Created Date: 5/21/2008 3:12:29 AM

V Workshop de Manutenção de Software Moderna (WMSWM)

7

understand specific details of the implemention, but he refused to use the SHriMP

visualization option and decided to visualize the code directly with the Eclipse editor.

Even though it occurred only once, it is relevant because users are declining a

visualization offered by the designer, one of the main features of comprehension tools.

Thus, declining this specific feature could lead declining the system as a whole.

Breakdowns associated to tags What happened?, Why doesn´t it? and I can´t do

it this way. were also identified. However, since they happened with such low

frequency they did not provide any conclusive indicators in this experiment.

Analyzing the results of the interview, it could be noticed that users considered

both tools easy to use. However, they pointed out that the dependencies view of both

presented interaction difficulties. SHriMP represented all the relationships in this view

with arcs. It resulted in an overwhelming amount of information to users. UJ on the

other hand, presents each kind of relationship on a different window, making it

necessary for users to manage several windows. Most participants believe that the

dependencies view is crucial for the effectiveness of a program comprehension tool.

The interviews also showed that participants had a better experience with

SHriMP diagrams because they resemble UML diagrams, and the icons representing the

entities followed the Eclipse standard. Although, the diagrams of UJ were considered

easy to understand, users pointed out that they felt it was easier to understand the

information by using SHriMP. They believed the main reason for that was the use of

standard known notation.

6. Final Remarks

This paper presented an evaluation of the user interface of two comprehension tools:

SHriMP and UJ. The user interfaces were evaluated using the CEM. Being a qualitative

method, the goal was not to obtain statistically valid results, but rather indicators of

relevant problems related to the interactive program comprehension systems.

In order to collect information on the impact of the comprehension strategy, the

experiment was planned to include tasks that were favored by system strategy (i.e.

abstract tasks favored by top-down strategy), as well as tasks that might be hindered by

the strategy (i.e get low level code information by use of a top-down strategy). The

study showed that the strategy adopted by users was mainly imposed by the tool, and

less influenced by the task. In UJ the abstract levels offered by the system were not

enough to provide the user with the information needed by the high level abstraction

tasks. Users had to create the abstractions based on existing views that favored a

bottom-up strategy. SHriMP took an integrated perspective, but favored a top-down

approach. When users performed the low level tasks that required a detailed view of the

code, they used a top-down strategy to find and access it. This result corroborates the

findings presented by [Storey et al. 2000] and points to the need to better understand the

factors that influence program comprehension systems effectiveness.

In that direction, indicators show that it may be interesting for tools to support a

quick understanding of the overall architecture of the system, as well as a good

visualization of the source code. In other words, provide both top-down and bottom-up

strategies that allowed users to decide on and adopt the best one for the task at hand.

The participants also pointed out that one of the most important features for program

Page 8: 40351€¦ · Title: Microsoft Word - 40351 Author: sony Created Date: 5/21/2008 3:12:29 AM

V Workshop de Manutenção de Software Moderna (WMSWM)

8

comprehension tools is the visualization of system dependencies, since it often is a

bottleneck in the comprehension process.

The analysis of the experiment showed that most of the communicative

breakdowns were related to finding the desired information in one of the many possible

views. It would be interesting to conduct a study of the use of these systems during a

longer period of time to see whether these breakdowns would decrease and users would

be able to understand what was depicted in the different views or if they would end up

declining some of the views and using a subset of them. A study in a longer period of

time would also allow for an observation whether other breakdowns that had only a few

occurrences in this study would increase. In regard to the views that offered a more

detailed level of information on demand, it would be interesting to investigate how well

they supported users in navigating through different levels of abstraction.

References

Brooks, R. E. (1983). Towards a theory of the comprehension of computer programs.

International Journal of Man-Machine Studies, 18(6):543–554.

de Souza, C. S. (2005). The semiotic engineering of humancomputer interaction. The

MIT Press, Cambridge, MA.

Letovsky, S. (1986). Cognitive processes in program comprehension. In 1st Workshop

on Empirical Studies of Programmers, pages 58–79, USA. Ablex Publishing Corp.

Prates, R. O. and Barbosa, S. D. J. (2007). Introdução à teoria e prática da interação

humano computador fundamentada na engenharia semiótica. In T.Kowaltowski e K.

K. Breitman (Org.), Atualizações em Informática 2007, pp 263–326, SBC 2007.

Prates, R. O., de Souza, C. S., and Barbosa, S. D. (2000). A method for evaluating the

communicability of user interfaces. ACM interactions, 7(1):31–38.

scitools.com. Java editor with reverse engineering, code navigation and automatic

documentation. Scientific Toolworks Inc. http://www.scitools.com/uj.html.

Shneiderman, B. and Mayer, R. (1979). Syntactic/semantic interactions of programming

behaviour: A model. Int´l Journal of Comp. and Information Sciences, 8(3):219–238.

Sim, S. E. (2003). A Theory of Benchmarking with Applications to Software Reverse

Engineering. PhD thesis, Dept. of Computer Science. University of Toronto, Canada.

Soloway, E. and Ehrlich, K. (1986). Empirical studies of programming knowledge.

Readings in artificial intelligence and software engineering, pages 507–521.

Storey, M.A. (2005). Theories, methods and tools in program comprehension: Past,

present and future. In: Proc. of the 13th Int´l Workshop on Program Comprehension,

pp 181–191, USA. IEEE Computer Society.

Storey, M.A., Best, C., Michaud, J., Rayside, D., Litoiu, M., and Musen, M. (2002).

SHriMP views: an interactive environment for information visualization and naviga-

tion. In Extended Abstracts CHI 2002, pp. 520–521, USA. ACM Press.

Storey, M., Wong, K., and Müller, H. (2000). How do program understanding tools

affect how programmers understand programs? Sci.Comp. Prog., 36(23):183–207.

Page 9: 40351€¦ · Title: Microsoft Word - 40351 Author: sony Created Date: 5/21/2008 3:12:29 AM

V Workshop de Manutenção de Software Moderna (WMSWM)

9

von Mayrhauser, A. and Vans, A. M. (1995). Program comprehension during software

maintenance and evolution. Computer, 28(8):44–55.