
Supporting Usability Evaluation of Multimodal Man-Machine Interfaces for Space Ground Segment Applications Using Petri Net Based Formal Specification

Philippe Palanque*

University Paul Sabatier, Toulouse, 31062, France

Regina Bernhaupt†

Universität Salzburg, Salzburg, 5020, Austria

David Navarre‡

University Toulouse 1, Toulouse, 31042, France

Mourad Ould§

CNES, Toulouse, 31041, France

and

Marco Winckler**

University Paul Sabatier, Toulouse, 31042, France

This paper describes the issues raised by the evaluation of multimodal interfaces in the

field of command and control workstations. Design, specification, verification and

certification issues for such Man-Machine Interfaces (MMIs) have been already identified as

critical activities. This paper focuses on the issues raised by their usability evaluation. We first present a formalism (Interactive Cooperative Objects) and its related CASE tool (PetShop) for the specification of such MMIs and then show how the models built

can support the usability evaluation phase. As a case study we present a multimodal

interaction for 3D navigation in a 3D satellite model.

I. Introduction

The importance of the Man-Machine Interface (MMI) part of Space Ground Segment Information Treatment applications is significant and increasing. The same holds for the costs of these MMIs throughout the various phases of their life cycle, namely design, development, operation and maintenance.

Current research work in the field of Human Computer Interaction promotes the development of new interaction

and visualization techniques in order to increase the bandwidth between the users and the systems they are

interacting with. Such an increase in bandwidth can have a significant impact on efficiency. For instance the number

of commands triggered by the users within a given amount of time and the error rate, typically the number of slips or

mistakes (Ref. 40) made by the users, are influenced by the user interface.

On the interaction technique side, these new technologies promote the use of multimodal interfaces allowing

users to interact with the system by entering data or commands using a combination of several input devices. In

addition to "traditional" devices such as keyboard and mouse other “new” devices are available for the designers.

Such devices include tactile screens, voice recognition systems, speech synthesisers, haptic devices (possibly

providing force feedback) and eye tracking (allowing visual pointing).

* Professor, LIIHS-IRIT, 118 Route de Narbonne, 31062 Toulouse cedex 4. † Dr., ICT&S-Center, Universität Salzburg, Sigmund-Haffner-Gasse 18, 5020 Salzburg. ‡ Dr., Place Anatole France, 31042 Toulouse cedex. § CNES (Centre National d'Etudes Spatiales), 18 avenue Edouard Belin, 31041 Toulouse cedex 4. ** Dr., Place Anatole France, 31042 Toulouse cedex.



The research results presented in this paper have been produced within a research project that targeted three main

objectives:

• study the potential contribution of these new MMI technologies for our Space Ground Segment IT

software,

• evaluate their adequacy with our needs in order to anticipate the design problems of the future MMIs and at

the same time to assess our space systems,

• develop interactive applications with the same quality as non-interactive applications, i.e., through a rigorous development process.

One of the key issues of the research carried out in the project is to find efficient ways of bringing together two

separate (and often opposing) concerns: usability and reliability. Indeed, the continuously increasing

complexity of the information manipulated by safety critical interactive systems calls for new interaction techniques

increasing the bandwidth between the system and the user. Multimodal interaction techniques are considered as a

promising way of tackling this problem. However, the lack of engineering techniques and processes for such

systems makes them hard to design and to build and thus jeopardizes their actual exploitation in the area of safety

critical applications. Space Ground Segment IT software is one of the application domains where such failures can

have a catastrophic impact on the equipment or data under manipulation.

In a previous paper (Ref. 36) we presented the challenges raised by multimodal and 3D user interfaces as far as

software design, specification and verification are concerned. We proposed a new formalism dedicated to the formal

specification of such interfaces and presented a software CASE (Computer Aided Software Engineering) tool

dedicated to this formalism. This tool, called PetShop (for Petri nets workshop), allows software developers to edit

and modify the models of the multimodal interface. This tool, a tutorial and a set of examples are available on our

research group web site: http://liihs.irit.fr/petshop

The current paper is dedicated to the last phase of the research project. After the phase of defining notations and

tools for the specification of multimodal interfaces, we now present how these components can support the

usability evaluation phase. Indeed, the fact that both the interaction technique and the entire application have been

formally specified makes it possible to exploit that information (usually not available in a “classical” development

process of interactive applications).

The paper is structured as follows. The next section (section 2) presents the state of the art in the field of

usability evaluation methods for multimodal interfaces. This section first presents a brief introduction about

multimodal interaction and multimodal interfaces and then compares the current practice in multimodal interfaces

usability evaluation. Section 3 informally presents the Interactive Cooperative Objects formal description technique.

It also describes the specificities of multimodal interfaces and their impact on the expressive power and verification

techniques. Section 4 introduces the case study used for providing a concrete example of multimodal interaction

techniques and for showing how the ICO formalism is applied. The case study is used here as a proof of concept

and does not correspond to any Space Ground Segment application currently deployed. The last section (section 5)

details through the explicit representation of user goals, tasks and interaction scenarios how usability evaluation

could be conducted for these kinds of systems. It also shows how the formal specification technique can provide

useful precise information for supporting this task.

II. Usability evaluation of multimodal interfaces

A. Introduction to interactive systems and multimodality

By definition, multimodal interaction techniques make it possible for the users to interact with the application using several modalities. A modality is defined as a pair made up of a device (input or output) and an interaction language. Multimodality can thus take place in two different ways:

• input multimodality, which involves the use of several input devices such as a mouse, keyboard, voice recognition system, gaze recognition, etc.;

• output multimodality, which involves the use of several output devices such as a screen, spatialized sound, 3D visualisation systems, etc.


Figure 1. Two-handed interaction on a tablet PC (extracted from Ref. 45)

Today, multimodal interaction techniques are used in almost all areas ranging from business software to

embedded systems such as cockpits of military aircraft (Ref. 7).

In this project we focus on input multimodality supporting the use of speech input and input by two mice. The

successful use of two-handed interaction has been shown in (Ref. 9). Using a toolglass and magic lenses, the user is able to select a color at the same time as choosing an object. A similar two-handed interaction for coloring objects has been shown in (Ref. 45) for pen and touch input (see Figure 1).

A previous study (Ref. 37) has shown that, in the field of safety critical systems, multimodal interaction presents several

advantages.

• Multimodality increases reliability of the interaction. Indeed, it drastically decreases critical errors (by 35% to 50%) during interaction. This advantage, on its own, can justify the use of multimodality

when interacting with safety critical systems.

• It increases efficiency of the interaction, in particular in the field of spatial commands (multimodal

interaction is 20% faster than classical interaction for specifying geometric and localization information).

• Users predominantly prefer interacting in a multimodal way, probably because it allows more flexibility in

interaction thus taking into account users' variability.

• Multimodality increases the naturalness and flexibility of interaction, so that the learning period is

shorter.

However, assessing quantitatively and in a predictive way both efficiency and usability of multimodal interactive

systems is still considered as a difficult problem by the research community in the field of human-computer

interaction. To structure the issues, a set of multimodal properties, called the CARE properties, have been identified

in (Ref. 17). According to this study, input and output modalities can be combined in four ways: complementarity,

assignment, redundancy and equivalence. These properties can be used both at design time when designers have to

define how multimodal interaction will take place and also at exploitation time when the users will actually use the

system and select how they will interact with the system. For instance if the designer provides two equivalent

modalities for triggering a command, the user will be able to choose any of them while interacting with the system.

Assignment of a given modality to a given command requires, on the other hand, that the user use only that modality

for triggering this command.

As CARE properties are more oriented towards design and use of modalities, they are of little help as far as the

precise specification or the construction of the system is concerned. Another classification (Ref. 17) proposed a system's

view on applications featuring multimodal interactions. Indeed, one of the key elements of multimodal interfaces is

related to the fusion of information provided by several devices used in a concurrent way by the user.

This classification (see Table 1) is structured according to 2 criteria:

1. Use of modalities: this criterion indicates if two modalities can be used in parallel (i.e. at the “same” time)

or in sequence (i.e. one after the other),

2. Interpretation: this criterion indicates if pieces of information coming from two different modalities are

fused to define a new command. This is the case, for instance, when the user triggers the command using voice

recognition and provides the parameter for this command using the mouse like in saying the word “delete” and

clicking simultaneously on a graphical object to be deleted.


                                      Use of modalities
                                Sequential        Parallel
  Interpretation   Combined     Alternating       Synergistic
                   Independent  Exclusive         Concurrent

Table 1. Usage and interpretation of modalities

The table shows that there can be four kinds of multimodal interaction, addressing both input and output:

• Exclusive: the poorest kind of multimodality. Modalities can only be used in a sequential way and information provided by two different modalities is not fused.

• Concurrent: modalities can be used both in sequence and in parallel, but information provided by two different modalities is not fused for triggering a command. Modalities therefore have to be used for different tasks and in a non-correlated way.

• Alternating: modalities can only be used in a sequential way and information provided by the different modalities is fused. Fusion/fission mechanisms can be applied to trigger specific commands if the information content is compatible.

• Synergistic: the most complex and powerful kind of multimodality. Modalities can be used both in sequence and in parallel, and information provided by two different modalities can be fused (a minimal fusion sketch is given below).
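To make the synergistic case concrete, the sketch below pairs a speech command with a mouse click arriving within a short time window, in the spirit of the "delete" plus click example given above. It is purely illustrative: the FusionSketch class, the 500 ms window and the event encoding are assumptions of ours, not part of any fusion engine described in this paper.

```java
// Illustrative only: a minimal fusion rule combining a speech command with a mouse
// click when both arrive within a 500 ms window (names and values are assumptions).
import java.util.ArrayDeque;
import java.util.Deque;

public class FusionSketch {
    record InputEvent(String modality, String value, long timeMs) {}

    private static final long FUSION_WINDOW_MS = 500;        // assumed fusion window
    private final Deque<InputEvent> pending = new ArrayDeque<>();

    /** Returns a fused command such as "delete(object42)", or null if no fusion occurs yet. */
    public String onEvent(InputEvent e) {
        // Discard pending events that are now too old to be fused with the incoming one.
        while (!pending.isEmpty() && e.timeMs() - pending.peekFirst().timeMs() > FUSION_WINDOW_MS) {
            pending.removeFirst();
        }
        InputEvent match = null;
        for (InputEvent p : pending) {
            boolean speechAndClick = !p.modality().equals(e.modality())
                    && (p.modality().equals("speech") || e.modality().equals("speech"));
            if (speechAndClick) { match = p; break; }
        }
        if (match != null) {                                 // synergistic use: information is fused
            pending.remove(match);
            String verb   = match.modality().equals("speech") ? match.value() : e.value();
            String target = match.modality().equals("speech") ? e.value()     : match.value();
            return verb + "(" + target + ")";
        }
        pending.addLast(e);                                  // no fusion yet: keep the event pending
        return null;
    }

    public static void main(String[] args) {
        FusionSketch fusion = new FusionSketch();
        fusion.onEvent(new InputEvent("speech", "delete", 1000));
        System.out.println(fusion.onEvent(new InputEvent("mouse", "object42", 1200))); // delete(object42)
    }
}
```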

B. Usability Evaluation methods for Multimodal Interfaces

Multimodal User Interfaces (MUI) are often referred to as more natural to users than "uni-modal" user interfaces (e.g. graphical UIs) because they make better use of users' communication channels. This assumption is supported by studies on two-handed multimodal user interfaces, which have shown that using two pointing devices in a normal graphical user interface is more efficient and understandable than basic mouse and keyboard interfaces, as described in (Ref. 13, 26, 46). However, multimodal interaction only brings benefits when the design takes into account human abilities and the appropriate selection of communication channels. Several studies have revealed that when these interfaces are designed poorly, they are neither better understood nor more efficient (Ref. 19, 26).

Figure 2. An overview of evaluation methods (UEM) for multimodal user interfaces: inspection (heuristic evaluation, cognitive walkthrough), user testing (thinking-aloud protocol, Wizard of Oz, log file analysis, field observation), inquiry (satisfaction questionnaires, workload questionnaires such as NASA-TLX) and analytical modelling (task model analysis, performance models).

The combinations of input and output devices and interaction modalities have opened a completely new world of experience for interactive systems, and raise the question of how to accurately evaluate the usability of multimodal user interfaces. The results of existing empirical studies of multimodal applications reveal very intricate problems concerning the assessment of multimodal technology, where individual user preferences for modality (Ref. 18), context of use, kind of activity supported by the system (task-oriented versus non-task-oriented) (Ref. 26),



use of specific devices (Ref. 2, 24) and interaction technique (Ref. 32) (e.g. pointing vs. clicking) all play a major role in determining usability.

There have been attempts to adapt traditional usability evaluation methods for use in multimodal systems, and a

few notable efforts to develop structured usability evaluation methods for multimodal applications. Regarding the

current practice of usability evaluation of multimodal systems, we may distinguish in Figure 2 four main

approaches: a) theoretical frameworks based on inspection methods, b) empirical investigation based on user testing,

c) user inquiry, and d) analytical modeling.

Despite the great importance of inspection methods such as Heuristic Evaluation (Ref. 33) and cognitive walkthrough (Ref. 28) for usability evaluation of user interfaces, they have been found less useful for assessing multimodal user interfaces.

The great majority of studies still employ some kind of user testing where user activity is measured while

performing some pre-defined tasks. Log file analysis (Ref. 39) and the think-aloud protocol are frequently employed, in both the laboratory and field observation (Ref. 25), with advanced prototypes. Nevertheless, mockups and early prototypes have also been tested using the Wizard of Oz technique (Ref. 27). User testing seems to be the preferred strategy in many studies of multimodal user interfaces because it allows investigating how users adopt and interact with multimodal technology.

Questionnaires have been extensively employed to obtain qualitative feedback from users (e.g. satisfaction,

perceived utility of the system, user preferences for modality) as well as to assess cognitive workload (Ref. 42), especially using the NASA-TLX method. Questionnaires have quite often been used in combination with user testing techniques (Ref. 25).

More recently, two main approaches have been developed to support analytical modeling, which aims to predict the usability of multimodal user interfaces. On the one hand, simulation and model-based checking of task specifications are used to predict usability problems such as unreachable states of the system or conflicting events required for fusion. (Ref. 39) proposes to combine task specifications based on ConcurTaskTree (CTT) with multiple data sources (e.g. eye-tracking data, video records) in order to better understand the user interaction and the task models used to support the development process of multimodal user interfaces. On the other hand, analytical modeling based on fitness functions (Ref. 38) and the Fuzzy Logical Model of Perception (FLMP) (Ref. 29), which belong to the category of predictive analytical modeling, tries to mathematically explain how users interact with the system. These approaches highlight how humans benefit from multiple sources of information from multiple modalities, but they cannot be used alone for usability assessment.

III. ICO: a formal description technique for safety critical interactive systems

Design, specification and verification of interactive systems are very complex, and classical techniques and tools

in the field of software engineering do not provide adequate support for these kinds of software systems. Complexity

increases when more sophisticated interaction techniques (such as multimodal interaction) are made available to the

users. We believe that the use of an adequate formal description technique can provide support for a more systematic

development of multimodal interactive systems. Indeed, formal description techniques allow for describing a system

in a complete and non-ambiguous way, thus allowing an easier understanding of problems among the various

persons participating in the development process. Besides, formal description techniques allow designers to reason

about the models by using analysis techniques. Classical results include the detection of deadlocks or the presence or absence of a terminating state. As stated above, a set of properties for multimodal systems has been identified (Ref. 17), but

their verification over an existing multimodal system is usually very difficult to achieve. For instance it is very

difficult to guarantee that two modalities are redundant whatever state the system is in.

C. Related Work on Engineering Multimodal Applications

This paper exploits a formal description technique that we have defined. This proposal builds upon previous

work we have done in the field of formal description techniques for interactive systems and is an answer to several

requests from industry to provide software engineers with software engineering techniques for multimodal systems.

Work in the field of multimodality can be sorted into five main categories. Of course, the aim of this

categorization is not to be exhaustive but to propose an organization of previous work in this field.

• Understanding multimodal systems: (Ref. 16) presents a typology of multimodal systems while (Ref. 17)

deals with properties of multimodal systems.

• Software construction of multimodal systems: (Ref. 8) and (Ref. 14) propose toolkits for the construction

of multimodal systems, and (Ref. 34) proposes a generic software architecture for multimodal systems.

• Analysis and use of novel modalities: (Ref. 11) presents the first use of voice and gesture as combined

modalities. (Ref. 12) introduces two-handed interaction, (Ref. 10) introduces the use of two-handed


interaction for virtual reality applications, and (Ref. 44) presents Jeanie, a multimodal application, to test the use of eye tracking and lip movement recognition.

• Multimodal systems description: (Ref. 15) presents QuickSet, a cooperative interface using both voice and

gesture, while (Ref. 34) presents a Multimodal Air Traffic Information System (MATIS) using both voice

and direct manipulation interaction. Similarly, (Ref. 9) presents a drawing system featuring two handed

interaction through a trackball and a mouse.

• Multimodal systems modeling: (Ref. 1) exploits high-level Petri nets for modeling two-handed interaction (a mouse and a trackball) and (Ref. 18, 45) use finite state automata for modeling two-handed interaction.

D. Informal Description of ICOs

The aim of this section is to recall the main features of the ICO (Interactive Cooperative Objects) formalism that we have proposed for the formal description of interactive systems. The formalism will be used for the case study

and performance evaluation in the next sections. We encourage the interested reader to look at (Ref. 4) and (Ref. 43)

for a complete presentation of this formal description technique.

The Interactive Cooperative Objects (ICOs) formalism is a formal description technique dedicated to the

specification of interactive systems (Ref. 5). It uses concepts borrowed from the object-oriented approach (dynamic

instantiation, classification, encapsulation, inheritance, client/server relationship) to describe the structural or static

aspects of systems, and uses high-level Petri nets (Ref. 22) to describe their dynamic or behavioral aspects.

ICOs are dedicated to the modeling and the implementation of event-driven interfaces, using several

communicating objects to model the system, where both behavior of objects and communication protocol between

objects are described by Petri nets. The formalism made up of both the description technique for the communicating objects and the communication protocol is called the Cooperative Objects formalism (CO, and its extension to CORBA, COCE (Ref. 43)).

In the ICO formalism, an object is an entity featuring four components: a cooperative object with user services, a

presentation part, and two functions (the activation function and the rendering function) that make the link between

the cooperative object and the presentation part.

Cooperative Object (CO) part: a cooperative object models the behavior of an ICO. It states how the object

reacts to external stimuli according to its inner state. This behavior, called the Object Control Structure (ObCS), is described by means of a high-level Petri net. A CO offers two kinds of services to its environment. The first one, described with CORBA-IDL (Ref. 35), concerns the services (in the programming language sense) offered to other objects in the environment. The second one, called user services, provides a description of the elementary actions offered to a user, but whose availability depends on the internal state of the cooperative object.

Presentation part: the Presentation of an object states its external appearance. This Presentation is a structured set

of widgets organized in a set of windows. Each widget may be a way to interact with the interactive system (user-

towards-system interaction) and/or a way to display information from this interactive system (system-towards-user

interaction).

Activation function: the user-towards-system interaction (inputs) only takes place through widgets. Each user

action on a widget may trigger one of the ICO's user services. The relation between user services and widgets is

fully stated by the activation function, which associates with each pair (widget, user action) the user service to be triggered.

Rendering function: the system-towards-user interaction (outputs) aims at presenting to the user the state changes that occur in the system. The rendering function maintains the consistency between the internal state of the system and its external appearance by reflecting system state changes.
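As an illustration of how the activation and rendering functions bridge the presentation part and the cooperative object, the following sketch uses hypothetical Java types (it is not the ICO/PetShop API): the activation function is represented as a map from (widget, user action) pairs to user services, and the rendering function as a map from state changes to presentation updates.

```java
// Illustrative sketch only: hypothetical types, not the actual ICO/PetShop API.
import java.util.HashMap;
import java.util.Map;

public class IcoLinkSketch {
    // Activation function: (widget, user action) -> user service to trigger.
    private final Map<String, Runnable> activation = new HashMap<>();
    // Rendering function: state-change name -> presentation update.
    private final Map<String, Runnable> rendering = new HashMap<>();

    public void bindActivation(String widget, String action, Runnable userService) {
        activation.put(widget + "/" + action, userService);
    }

    public void bindRendering(String stateChange, Runnable presentationUpdate) {
        rendering.put(stateChange, presentationUpdate);
    }

    /** Called by the presentation part when the user acts on a widget (input side). */
    public void onUserAction(String widget, String action) {
        Runnable service = activation.get(widget + "/" + action);
        if (service != null) {
            service.run();          // fires the corresponding user service of the cooperative object
        }
    }

    /** Called by the cooperative object when its internal state changes (output side). */
    public void onStateChange(String stateChange) {
        Runnable update = rendering.get(stateChange);
        if (update != null) {
            update.run();           // keeps the external appearance consistent with the state
        }
    }

    public static void main(String[] args) {
        IcoLinkSketch link = new IcoLinkSketch();
        link.bindActivation("forwardButton", "press", () -> System.out.println("user service: startMove"));
        link.bindRendering("token in place moving", () -> System.out.println("render: highlight button"));
        link.onUserAction("forwardButton", "press");
        link.onStateChange("token in place moving");
    }
}
```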

ICOs are used to provide a formal description of the dynamic behavior of an interactive application. An ICO

specification fully describes the potential interactions that users may have with the application. The specification

encompasses both the "input" aspects of the interaction (i.e. how user actions impact on the inner state of the

application, and which actions are enabled at any given time) and its "output" aspects (i.e. when and how the

application displays information relevant to the user).

An ICO specification is fully executable, which gives the possibility to prototype and test an application before it

is fully implemented (Ref. 6). The specification can also be validated using analysis and proof tools developed

within the Petri nets community and extended in order to take into account the specificities of the Petri net dialect

used in the ICO formal description technique. This formal specification technique has already been applied in the

field of Air Traffic Control interactive applications. A case study in this field can be found in (Ref. 30).
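To give an intuition of what "fully executable" means, the following minimal sketch steps through an ordinary place/transition net: it checks which transitions are enabled by the current marking and fires them. It is a deliberately simplified illustration (no high-level tokens, no timing, no objects) and not the PetShop engine.

```java
// Minimal, illustrative place/transition net executor: enough to show how an
// executable specification can be stepped through for prototyping. A sketch only.
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class PetriNetSketch {
    private final Map<String, Integer> marking = new HashMap<>();        // place -> token count
    private final Map<String, List<String>> inputs = new HashMap<>();    // transition -> input places
    private final Map<String, List<String>> outputs = new HashMap<>();   // transition -> output places

    public void place(String name, int tokens) { marking.put(name, tokens); }

    public void transition(String name, List<String> in, List<String> out) {
        inputs.put(name, in);
        outputs.put(name, out);
    }

    /** A transition is enabled when every input place holds at least one token. */
    public boolean isEnabled(String t) {
        return inputs.get(t).stream().allMatch(p -> marking.getOrDefault(p, 0) > 0);
    }

    /** Firing removes one token from each input place and adds one to each output place. */
    public void fire(String t) {
        if (!isEnabled(t)) throw new IllegalStateException(t + " is not enabled");
        inputs.get(t).forEach(p -> marking.merge(p, -1, Integer::sum));
        outputs.get(t).forEach(p -> marking.merge(p, 1, Integer::sum));
    }

    public static void main(String[] args) {
        PetriNetSketch net = new PetriNetSketch();
        net.place("p1", 1);
        net.place("p2", 0);
        net.transition("start", List.of("p1"), List.of("p2"));
        net.transition("stop", List.of("p2"), List.of("p1"));
        net.fire("start");
        System.out.println(net.isEnabled("stop"));   // true: a simple two-place loop
    }
}
```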


IV. Multimodal interaction on a 3D satellite model

This section presents the application of the formalism introduced in the previous sections to a case study in the field of space applications. Even though the application is not one currently used by "real" users, it has been designed in

order to cover a wide range of issues spanning from formal specification of interactive behaviors to usability

evaluation.

E. Informal presentation of the case study

This application provides multimodal interaction techniques to a user moving the point of view (we will later call

this “navigating”) in a 3D model of a satellite. This navigation can be done either by rotating the 3D model of the

satellite directly using the mouse on the 3D image or using the two control panels presented in Figure 3.


Figure 3. The 3D representation of Demeter satellite (a) and its two control panels (b and c)

The control panel (b) entitled “point de vue” allows the user to manipulate the current position of the point of

view of the 3D image using the set of buttons in the top right hand side of Figure 3b. The set of buttons in the

“orientation” section allows rotating the satellite image in any direction. The two list-boxes on the left hand side

present respectively the list of components of the satellite and the list of categories the components belong to. We do

not present the other parts of the user interface as they are beyond the scope of this paper.

At the beginning the satellite appears as presented in Figure 3a). The main task given to the user of this

application is to locate one or several components in the satellite. This task is not easy to perform as components are

nested and might not be visible. Thus the user interface offers the possibility to make the set of components selected

from the list partly or fully transparent. This transparency is set by means of the Transparence slider on the right

hand side of Figure 3c). The goal of the user is to locate components that can be of two types: overheating and over-

consuming. The selection of the range of temperature of interest and the range of consumption can be done using the

range slider in the section “données” in the right hand side of Figure 3c. Figure 4 presents a snapshot of the satellite

3D model including the temperature of the visible components.


Figure 4. 3D satellite model displaying the temperature of the visible components

In this application multimodal interaction takes place both while using the button pairs, changing the point of

view of the 3D model, and while interacting with the range slider for selecting the temperature and the consumption.

Due to space constraints we only present here multimodal interaction on the button pair. The interested reader

can see the formal specification of a similar multimodal range slider component in (Ref. 21).

Figure 5. One of the multimodal interactions in the application

Figure 5 shows the multimodal interaction in action. In this figure the user is currently using 3 input devices at a

time: 2 mice and 1 speech recognition system. The speech recognition system allows only 2 different words to be entered: "fast" and "slow". The interaction takes place in the following way: at any time the user can use any of the

mice to press on the buttons that change the point of view. In Figure 5, the button that moves the satellite image

backwards (with the additional label right mouse interaction on Figure 5) has been pressed using the right mouse.

Simultaneously the left mouse has been positioned on the button moving the satellite image to the left. At that time

the image has already started to move backwards and, as soon as the other button is pressed, the image will be moving both backwards and to the left. The user is also able to increase or decrease the movement speed by uttering

the words "fast" and "slow". In Figure 5 the word "fast" has been uttered and recognized by the speech recognition system and is thus displayed (as shown on the left-hand side of Figure 5). This action will reduce the time between

two movements of the image. Indeed, the image is not moved according to the number of clicks on the buttons but

according to the time the buttons are kept pressed by the user.

Describing such interaction techniques in a complete and unambiguous way is one of the main issues to be solved

while specifying and developing multimodal interactive systems. The next section presents how the ICO formalism

is able to deal with these issues. Additionally it will show that the description above is incomplete and does not



address at an adequate level of detail both timed and concurrent behavior at least when it comes to implementation

issues.

F. ICO modeling of the case study

This section is devoted to the formal modeling of the multimodal interactive application presented in section E. In this multimodal application there is no fusion engine per se: the two mice are handled independently and the

speech interaction affects movement speed whatever interaction is performed with the mice.

The modeling is structured as represented in Figure 6. The right hand side of the figure shows the user

interacting with the input devices. As stated before three input devices are available. In order to configure this set of

input devices we use a dedicated notation called ICon (Ref. 14). A more readable model of this configuration is represented

in Figure 7.


Figure 6. Software architecture of the multimodal interactive application

The left hand side of Figure 7 represents the 3 input devices connected to software components. These

components are represented as graphical bricks, and connectors model the data flow between these bricks. For

instance it defines that interaction with the mice will take place using the left button (but1 in the usbMouse brick)

and that the alternate button for the speech recognition system is the space bar (Space label in the keyboard brick

connected to the speechCmd brick). The right hand side of this figure represents contact points with the other

models of the application. As input configurations are not central to the scientific contribution of this paper we do

not present in more detail how this modeling works. More information about the system supporting the editing and

execution of models, the behavior of a model and the connections to other models can be found in (Ref. 31).

Similarly, functional core and communication protocol between the functional core and the interaction models are

not presented.


Figure 7. Input configuration using ICon (Ref. 14)

Figure 8. Model of the temporal evolution of movements driven by speech (continuous move in Figure 6)

The ICO model in Figure 8 represents the complete and unambiguous temporal behavior of the speech-based

interaction technique as well as how speech commands impact the temporal evolution of the graphical representation

of the 3D image of the satellite. Darker transitions are those available according to the current marking of the model. Taking into account the current marking of the model of Figure 8 (one token in each of the places delay, Idle and core), only transitions startMove_1, faster_ and slower_ are available. These transitions describe the multimodal interaction technique available, i.e. how each input device can be used to trigger actions on the system. Transitions faster_ and slower_ are triggered when the user utters one of the two speech commands fast and slow. In the initial state these transitions are available and will remain available until the upper and lower limits are reached (respectively delay>1000 for transition slower_ and delay<100 for transition faster_).
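A possible reading of the faster_ and slower_ transitions in conventional code is sketched below. The step of 50 ms and the 100-1000 ms bounds come from the model description (see section I); the initial delay value and the symmetric decrement for faster_ are assumptions made here for illustration only.

```java
// Illustrative sketch of the guards and actions of transitions faster_ and slower_.
public class SpeechSpeedControl {
    private static final int MIN_DELAY_MS = 100;    // fastest rendering rate (precondition bound)
    private static final int MAX_DELAY_MS = 1000;   // slowest rendering rate (precondition bound)
    private int delayMs = 500;                      // assumed initial value, for illustration

    /** Speech command "fast" (transition faster_): enabled only while delay can still decrease. */
    public void onSpeechFast() {
        if (delayMs > MIN_DELAY_MS) {               // guard of faster_
            delayMs = Math.max(MIN_DELAY_MS, delayMs - 50);   // assumed symmetric step
        }
    }

    /** Speech command "slow" (transition slower_): enabled only while delay can still increase. */
    public void onSpeechSlow() {
        if (delayMs < MAX_DELAY_MS) {               // guard of slower_
            delayMs = Math.min(MAX_DELAY_MS, delayMs + 50);   // the "delay = old + 50" action
        }
    }

    public int getDelayMs() { return delayMs; }
}
```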


Figure 9. Mutual exclusion of the pair of buttons for changing the point of view (button pair in Figure 6)

Figure 9 presents another model of the application, responsible for describing the behavior of each button pair.

By button pair we mean the buttons that are antinomic, namely (up, down), (left, right) and (backwards, forward).

These 3 button pairs are represented on the right hand side of Figure 5. To model this antinomy the ICO description

represents the fact that the user can press either the positive or the negative button. Once pressed, a button is released when the left button on the input device is released (as represented in the ICon model of Figure 7).
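The mutual exclusion expressed by this model can be paraphrased with a small state-based sketch (illustrative only, with hypothetical names): a single idle state plays the role of the shared token, so that at most one button of the pair can be pressed at a time.

```java
// Illustrative mutual-exclusion sketch for an antinomic button pair (e.g. forward/backwards).
public class ButtonPairSketch {
    private enum State { IDLE, POSITIVE_PRESSED, NEGATIVE_PRESSED }
    private State state = State.IDLE;

    public boolean pressPositive() {              // e.g. "forward"
        if (state != State.IDLE) return false;    // the other button already holds the "token"
        state = State.POSITIVE_PRESSED;
        return true;
    }

    public boolean pressNegative() {              // e.g. "backwards"
        if (state != State.IDLE) return false;
        state = State.NEGATIVE_PRESSED;
        return true;
    }

    public void release() {                       // left mouse button released on the input device
        state = State.IDLE;
    }
}
```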

G. Low level interaction constraints

The models of Figure 8 and Figure 9 integrate some constraints for the user. These constraints can result from the

design process and design choices (the 3D image will not move faster than one modification every 100 milliseconds)

or physical constraints (one button on a mouse can only be either pressed or released).

All of these constraints are made explicit in the models and thus provide a unique source of information about

the actual precise behavior of the interaction technique. Thanks to the expressive power of the underlying Petri net

formalism used in ICO, concurrent behavior can be described together with quantitative temporal evolution (the

image will change every 100 milliseconds). Indeed, Petri nets are the only formalism able to express concurrency together with both qualitative and quantitative temporal evolutions.

For example the ICO model makes explicit the speech-driven temporal evolution of the model. The value of the

delay between two images is stored in a variable called Delay. This variable is then used as the timer in the

transition Sleep (bottom right hand side of Figure 8). This construct is defined in Generalized Stochastic Petri nets (Ref. 3) and behaves as follows. According to the buttons pressed using the mice, a token containing this information is set in place moving. While the buttons are pressed, the transition move is fired, performing the calculation of the 3D image and rendering it on the screen. This moves the token from place core to place done, making the transition move unavailable (there is no longer a token in one of its input places, the place core). Transition Sleep then becomes available (one token in each of its input places, here only place done) but does not fire immediately, as it is a timed transition. Indeed, the transition waits until the amount of time delay (the variable delay stores a number of milliseconds) has elapsed before firing. When this amount of time has elapsed, the transition fires, removing the token from place done and setting a new one in place core. After this loop, transition move becomes available again and is thus ready to render a new 3D image of the satellite. As transitions startMove_2 and stopMove remain available during the loop, the model makes explicit the fact that users can press or release any button on the user interface at any time.
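The loop described above can be emulated in conventional code as follows. This is only an illustrative paraphrase of the model, not code generated from it: the place and transition names (moving, delay, move, Sleep, startMove, stopMove) follow Figure 8, but the thread-based implementation and the 500 ms initial delay are assumptions of ours.

```java
// Illustrative emulation of the move/Sleep loop of Figure 8: while a direction token is in
// place "moving", transition "move" renders a frame, then the timed transition "Sleep"
// waits 'delay' milliseconds before re-enabling "move".
public class MoveLoopSketch implements Runnable {
    private volatile boolean moving = false;      // token present in place "moving"
    private volatile String direction = null;     // value carried by that token
    private volatile int delayMs = 500;           // place "delay" (changed by faster_/slower_)

    public void startMove(String dir) { direction = dir; moving = true; }   // startMove_1/_2
    public void stopMove()            { moving = false; }                   // stopMove
    public void setDelay(int ms)      { delayMs = ms; }

    @Override
    public void run() {
        while (!Thread.currentThread().isInterrupted()) {
            if (moving) {
                // Transition "move": the token goes from place "core" to place "done"
                // while the 3D image is recomputed and rendered.
                System.out.println("render frame, direction = " + direction);
            }
            try {
                // Timed transition "Sleep": waits 'delay' ms, then puts the token back in "core".
                Thread.sleep(delayMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }

    public static void main(String[] args) throws InterruptedException {
        MoveLoopSketch loop = new MoveLoopSketch();
        Thread t = new Thread(loop);
        t.start();
        loop.startMove("backwards");
        Thread.sleep(1600);          // a few frames at the default delay
        loop.stopMove();
        t.interrupt();
    }
}
```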

V. Model-Based Usability Evaluation

H. Basic Principles of Model-Based Evaluation

As presented in section B, to ensure usability of these kinds of applications, various methods from the field of

human-computer interaction (HCI) can be applied. Even though any kind of usability evaluation method can help in

this process, we will focus on applying usability tests.


A typical usability test is performed in a laboratory (sometimes in the field), where users are asked to perform selected tasks. The users are observed by cameras, and they might be asked to think aloud (also called elicitation activity) while performing the task. A usability test typically begins by asking the users to fill in a pre-questionnaire related

to the domain of the software (use of other related systems, experience with multimodal interfaces, hours of training

…). Some tasks are then performed to ensure that the user is able to use the system. Testing multimodal interactions

usually requires an additional activity corresponding to the presentation of the key input modalities to the user. The

user then performs the tasks. Tasks have to be completed within a given time. If the user cannot solve the task within

this time, the experimenter (leader of the usability test) helps the user by giving hints or providing the solution. The

number of successful completions and the completion time are recorded. Tasks not solved indicate usability

problems, leading to further detailed investigations of the problems. For an example of a usability test recording see

Figure 10.

Figure 10. Example of usability test in action

When complex interaction techniques are considered (as in the current application) the presentation of the

application to be tested with the user also requires a description of the actual interaction technique. This description

goes beyond the typical high-level (task-based) scenarios promoted by usability testing methods. The goal of a

usability test is to improve the interface by finding out major usability problems within the interface. While a

common practice is to use the most frequently performed tasks (based on the task analysis), in the field of safety

critical systems, it is important to cover all (or most of) the possible interactions that the user might be involved in.

The explicit description, in the formal models, of the interaction techniques makes it possible to identify not only the

“minimum” number of scenarios to be tested but also to select more rationally the tasks that are to be focused on.

When testing multimodal interfaces, this constraint reaches a higher level of complexity due to the significant

number of possible combinations of input modalities and also due to the fact that fusion engines usually involve

quantitative temporal evolution as presented in section F. In order to test all (or most) of these combinations it is

required to provide usability test scenarios at a much lower level of description than what is usually done with more


classical systems. Indeed, as for walk-up-and-use systems, the interaction technique must be natural enough for the

user to be able to discover it while interacting with the system.

Even though we need to address this issue of low level scenarios it is also important to notice that usability

testing is very different from software testing. The objective here is to test the usability of the interaction technique

and not its robustness or defect-freeness as in classical software testing. The issue of reliability testing of

multimodal interactive systems is also very important but it is beyond the scope of this paper. Formal methods can

help to specify the “real” number of low-level interaction scenarios and thereby inform selection of tasks more

appropriately.
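One way formal models can support this is by enumerating, up to a chosen length, the firing sequences of user actions that the model allows; each sequence is a candidate low-level test scenario. The sketch below does this on a deliberately over-simplified two-state view of the interaction (the state names, action labels and encoding are hypothetical, not PetShop output).

```java
// Illustrative sketch: enumerate action sequences of a given length from a very
// simplified state/transition view of the interaction model, as candidate scenarios.
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

public class ScenarioEnumerationSketch {
    // Hypothetical two-state abstraction: state -> (user action -> next state).
    static final Map<String, Map<String, String>> NET = Map.of(
        "idle",   Map.of("pressForward", "moving", "sayFast", "idle",   "saySlow", "idle"),
        "moving", Map.of("release",      "idle",   "sayFast", "moving", "saySlow", "moving"));

    static void enumerate(String state, List<String> prefix, int depth, List<List<String>> out) {
        if (depth == 0) {
            out.add(new ArrayList<>(prefix));            // one candidate low-level scenario
            return;
        }
        for (Map.Entry<String, String> t : NET.get(state).entrySet()) {
            prefix.add(t.getKey());
            enumerate(t.getValue(), prefix, depth - 1, out);
            prefix.remove(prefix.size() - 1);
        }
    }

    public static void main(String[] args) {
        List<List<String>> scenarios = new ArrayList<>();
        enumerate("idle", new ArrayList<>(), 3, scenarios);   // all length-3 action sequences
        scenarios.forEach(System.out::println);
        System.out.println(scenarios.size() + " candidate scenarios for this depth");
    }
}
```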

It is important to note that we are not claiming that current practices in the field of usability evaluation must

involve model-based usability evaluation. Our claim is that in the field of safety critical interactive systems and

more specifically when multimodal interaction techniques are considered, model-based approaches can support

specific activities (such as low-level test scenario and task identification) that could otherwise be overlooked or not

systematically considered.

The next section presents examples of low-level interaction scenario descriptions for the case study.

I. Example of Model-Based Evaluation on the Case Study

The description of the temporal evolution presented in section G shows how complex low-level multimodal

interaction can be. When it comes to testing the usability of such behavior, it is first required to provide a detailed description of the behavior to the evaluators, but also to make it possible to modify such behavior if the results of the usability testing require it. Some of the modifications in the ICO model of Figure 8 are trivial:

• Changing the amount by which the delay is increased or decreased when speech commands are issued: this can be done by changing the line delay=old+50 in the transition slower_, for instance, to another amount of increase.

• Changing the maximum speed of 3D image rendering: this can be done by changing the precondition in transition slower_ or faster_ to values other than 1000 (maximum) and 100 (minimum).

Other complex behaviors relating to qualitative temporal aspects can also be represented and thus exploited during usability tests. For instance, as modeled in Figure 8, all the input modalities are available at all times, but another design choice could have been to allow the use of at most 2 input modalities at a time. In such a case, this limitation should be presented in precise detail to the user before executing the evaluation

scenarios. Similarly, some scenarios could have been selected with the explicit purpose of evaluating comfort and

cognitive workload induced by this kind of reduction of the interaction space.

This notion of low-level interaction technique can have a significant impact on the results, and thus on the interpretation, of usability tests. We are currently performing such a model-based evaluation on a real ground segment information treatment system to assess the impact of multimodal interaction techniques on ease of use and performance. The goal is also to assess the impact of model-based evaluation with respect to more classical usability evaluation techniques for multimodal systems (such as the ones presented in section B).

VI. Conclusion

This paper has presented the use of a formal description technique for describing multimodal interactive

applications. Beyond that, we have shown that this formal description technique is also adequate for interaction

techniques and low level interactive components. One of the advantages of using the ICO formal description

technique is that it provides additional benefits with respect to other notations such as Statecharts (Ref. 41). Thanks to its Petri net basis, the ICO notation makes it possible to model behavior featuring an infinite number of states (as

states are modeled by a distribution of tokens in the places of the Petri nets). Another advantage of ICOs is that they

allow designers to use verification techniques at design time as has been presented in (Ref. 30). These verification

techniques are of great help for certification purposes. Beyond these software engineering benefits, we have also

shown that this model-based approach can also support the usability evaluation activities that are usually considered

externally to the actual development process. This specific contribution provides a first step towards integrating, in the same development framework, requirements coming from both the reliability and usability communities.

Acknowledgments

This work is partly funded by CNES (National Center of Spatial Studies in France), DGA (French Army Research

Dept.) under contract INTUITION #00.70.624.00.470.75.96, EU via the ADVISES Research Training Network

RTN2-2001-00053 and Network of Excellence ResIST (www.resist-noe.org).


References

1. Accot J., Chatty S. and Palanque P. A Formal Description of Low Level Interaction and its Application to Multimodal Interactive Systems. 3rd EUROGRAPHICS Workshop on Design, Specification and Verification of Interactive Systems, Springer-Verlag, pp. 92-104, 1996.
2. Accot J., Zhai S. Performance evaluation of input devices in trajectory-based tasks: an application of the steering law. Proceedings of the SIGCHI Conference on Human Factors in Computing Systems CHI'99, ACM Press, pp. 466-472.
3. Ajmone Marsan M., Balbo G., Conte C., Donatelli S. and Franceschinis G. Modelling with Generalized Stochastic Petri Nets. Wiley, 1995.
4. Bastide R. and Palanque P. A Petri-Net Based Environment for the Design of Event-Driven Interfaces. 16th International Conference on Applications and Theory of Petri Nets, ICATPN'95, Torino, Italy, Lecture Notes in Computer Science, no. 935, Springer Verlag, 1995, pp. 66-83.
5. Bastide R. and Palanque P. A Visual and Formal Glue Between Application and Interaction. Journal of Visual Languages and Computing, 10, no. 3, 1999.
6. Bastide R., Navarre D. and Palanque P. A Model-Based Tool for Interactive Prototyping of Highly Interactive Applications. Proceedings of ACM SIGCHI 2002 (Extended Abstracts), pp. 516-517, 2002.
7. Bastide R., Navarre D., Palanque P., Schyn A. and Dragicevic P. A Model-Based Approach for Real-Time Embedded Multimodal Systems in Military Aircrafts. 6th ACM International Conference on Multimodal Interfaces (ICMI'04), October 14-15, 2004, Pennsylvania State University, USA, pp. 245-258.
8. Bederson B., Meyer J. and Good L. Jazz: an Extensible Zoomable User Interface Graphics Toolkit in Java. UIST'2000, ACM Symposium on User Interface Software and Technology, pp. 171-180, 2000.
9. Bier E. A., Stone M. C., Pier K., Buxton W. and DeRose T. Toolglass and Magic Lenses: The See-Through Interface. Computer Graphics, T. Kajiya (ed.), 1993, pp. 73-80.
10. Bolt R. and Herranz E. Two-Handed Gesture in Multi-Modal Natural Dialog. Proceedings of the Fifth Annual ACM Symposium on User Interface Software and Technology, ACM Press, California, pp. 7-14, 1992.
11. Bolt R. Put That There: Voice and Gesture at the Graphics Interface. SIGGRAPH'80, pp. 262-270, 1980.
12. Buxton W. and Myers B. A Study in Two-Handed Input. Proceedings of ACM CHI, Addison-Wesley, 1986, pp. 321-326.
13. Buxton W., Myers B. A. A Study in Two-Handed Input. Human Factors in Computing Systems, CHI'86 Conference Proceedings, ACM Press, 1986, pp. 321-326.
14. Chatty S. Extending a Graphical Toolkit for Two-Handed Interaction. Proceedings of the ACM Symposium on User Interface Software and Technology, ACM Press, California, pp. 195-204, 1994.
15. Cohen P., Johnston M., McGee D., Oviatt S., Pittman J., Smith I., Chen L. and Clow J. QuickSet: Multimodal Interaction for Distributed Applications. Proceedings of the Fifth ACM International Conference on Multimedia, Seattle, Washington, United States, ACM Press, 1997, pp. 31-40.
16. Coutaz J. and Nigay L. A Design Space for Multimodal Systems: Concurrent Processing and Data Fusion. ACM Human Factors in Computing Systems Conference, INTERCHI'93, pp. 172-178, 1993.
17. Coutaz J., Nigay L., Salber D., Blandford A., May J. and Young R. Four Easy Pieces for Assessing the Usability of Multimodal Interaction: the CARE Properties. Interact'95, Lillehammer, Norway, Chapman & Hall (IFIP), 1995, pp. 115-120.
18. den Os E., de Koning N., Jongebloed H. and Boves L. Usability of a Speech-Centric Multimodal Directory Assistance Service. Proceedings of the CLASS Workshop on Information Presentation and Natural Multimodal Dialogs, Verona, Italy, 2001, pp. 65-69.
19. Dillon R. F., Edey J. D., Tombaugh J. W. Measuring the True Cost of Command Selection: Techniques and Results. Human Factors in Computing Systems, CHI'90 Conference Proceedings, ACM Press, 1990, pp. 19-25.
20. Dragicevic P. and Fekete J-D. Input Device Selection and Interaction Configuration with ICON. Proceedings of IHM-HCI 2001, Blandford A., Vanderdonckt J., Gray P. (Eds.): People and Computers XV - Interaction without Frontiers, Lille, France, Springer Verlag, pp. 543-448.
21. Dragicevic P., Navarre D., Palanque P., Schyn A. and Bastide R. Very-High-Fidelity Prototyping for both Presentation and Dialogue Parts of Multimodal Interactive Systems. DSVIS/EHCI 2004 joint conference, 11th Workshop on Design Specification and Verification of Interactive Systems and Engineering for HCI, Germany, July 11-13, Lecture Notes in Computer Science no. 3452, Springer Verlag, 2004.
22. Genrich H. Predicate/Transition Nets. High-Level Petri Nets: Theory and Application, K. Jensen and G. Rozenberg (Eds.), Springer Verlag, pp. 3-43, 1991.
23. Hinckley K., Czerwinski M. and Sinclair M. Interaction and Modelling Techniques for Desktop Two-Handed Input. Proceedings of ACM UIST, 1998, pp. 49-58.
24. Hinckley K., Pausch R., Proffitt D., Kassel N. F. Two-Handed Virtual Manipulation. ACM Transactions on Computer-Human Interaction, 5 (3), 1998, pp. 260-302.
25. Jost M., Haubler J., Merdes M., Malaka R. Multimodal Interaction for Pedestrians: an Evaluation Study. Proceedings of IUI'2005, ACM Press, San Diego, USA, January 9-12.
26. Kabbash P., Buxton W., Sellen A. Two-Handed Input in a Compound Task. Human Factors in Computing Systems CHI'94 Conference Proceedings, ACM Press, 1994, pp. 417-423.
27. Klein A., Schwank I., Généreux M., Trost H. Evaluating Multimodal Input Modes in a Wizard-of-Oz Study for the Domain of Web Search. In: Blandford A., Vanderdonckt J. and Gray P. (eds), People and Computers XV - Interaction without Frontiers: Joint Proceedings of HCI 2001 and IHM 2001, pp. 475-483, Springer: London, September.
28. Lewis C., Polson P., Wharton R. Testing a Walkthrough Methodology for Theory-Based Design of Walk-Up-and-Use Interfaces. Proceedings of CHI'90, ACM Press, 1990, pp. 235-241.
29. Massaro D. W. A Framework for Evaluating Multimodal Integration by Humans and a Role for Embodied Conversational Agents. Proceedings of ICMI'2004, October 13-15, 2004, Pennsylvania, USA.
30. Navarre D., Palanque P. and Bastide R. Reconciling Safety and Usability Concerns through Formal Specification-based Development Process. HCI-Aero'02, MIT, USA, 23-25 October, 2002, pp. 168-175.
31. Navarre D., Palanque P., Dragicevic P. and Bastide R. An Approach Integrating two Complementary Model-based Environments for the Construction of Multimodal Interactive Applications. Interacting with Computers, vol. 17, no. 3 (to appear), 2006.
32. Nedel L., Freitas C. M. D. S., Jacob L., Pi M. Testing the Use of Egocentric Interactive Techniques in Immersive Virtual Environments. Proceedings of the Ninth IFIP TC13 International Conference on Human-Computer Interaction, INTERACT'03, Amsterdam: IOS Press, 2003, pp. 471-478.
33. Nielsen J., Mack R. (eds.) Usability Inspection Methods. New York: Wiley, pp. 25-62, 1994.
34. Nigay L. and Coutaz J. A Generic Platform for Addressing the Multimodal Challenge. Human Factors in Computing Systems CHI'95 Conference Proceedings, Denver, Colorado, USA, 1995, pp. 98-105.
35. OMG. The Common Object Request Broker: Architecture and Specification. CORBA IIOP 2.2, Framingham, 1998.
36. Ould M., Palanque P., Schyn A., Bastide R., Navarre D., Rubio F. Multimodal and 3D Graphic Man Machine Interfaces to Improve Operations. SpaceOps 2004, AIAA, Spring 2004, Montreal, Canada.
37. Oviatt S. Ten Myths of Multimodal Interaction. Communications of the ACM, 1999, 42(11):74-81.
38. Panttaja E. M., Reitter D., Cummins F. The Evaluation of Adaptable Multimodal Systems Outputs. Proceedings of the DUMAS Workshop on Robust and Adaptive Information Processing for Mobile Speech Interfaces, 2004, Geneva, Switzerland.
39. Paternò F., Santos I. Designing and Developing Multi-user, Multi-device Web Interfaces. Proceedings of CADUI 2006 (to appear), Springer Verlag, Bucharest, Romania, June 5-8, 2006.
40. Reason J. Human Error. Cambridge University Press, 1990.
41. Sherry L., Polson P., Feary M. and Palmer E. When Does the MCDU Interface Work Well? Lessons Learned for the Design of New Flightdeck User-Interfaces. Proceedings of HCI Aero 2002, AAAI Press, pp. 180-186.
42. Suhm B., Myers B., Waibel A. Model-based and Empirical Evaluation of Multimodal Interactive Error Correction. Proceedings of CHI'99, Pittsburgh, USA, 15-20 May 1999, pp. 584-591.
43. Sy O., Bastide R., Palanque P., Le D-H. and Navarre D. PetShop: a CASE Tool for the Petri Net Based Specification and Prototyping of CORBA Systems. 20th International Conference on Applications and Theory of Petri Nets, ICATPN'99, Springer Verlag, pp. 145-172.
44. Vo M. T. and Wood C. Building an Application Framework for Speech and Pen Input Integration in Multimodal Learning Interfaces. Proceedings of the IEEE International Conference on Acoustics, Speech, and Signal Processing, 1996, Vol. 6, pp. 3545-3548.
45. Yee K-P. Two-Handed Interaction on a Tablet Display. Late Breaking Results, ACM CHI 2004 Conference, Vienna, Austria, 2004, pp. 1493-1496.
46. Zhai S., Barton A. S., Selker T. Improving Browsing Performance: a Study of Four Input Devices for Scrolling and Pointing Tasks. Proceedings of INTERACT'97: The IFIP Conference on Human-Computer Interaction, 1997, pp. 286-292.