Evaluation of Interactive Systems Design/Prototype/Product

Md. Saifuddin Khalid, Assistant Professor

Master's Programme in Information Technology, IT and Learning, with Specialisation in Organisational Change

Location of course: 8th semester

Course’s scope: 5 ECTS

Aalborg University, Aalborg, Denmark

Spring 2017, ID6-L1

Aims

• Evaluation is the fourth main phase of the interactive systems design process.

• Evaluation means reviewing, trying out or testing a design idea, a piece of software, a product or a service to discover whether it meets some criteria.

• After studying Benyon (2010, chapter 10) and attending today’s session you should be able to:
– Appreciate the uses of a range of generally applicable evaluation techniques designed for use with and without users
1. Understand expert-based evaluation methods
2. Understand participant-based evaluation methods
3. Apply the techniques in appropriate contexts.

Monday, 27 March 2017

Categorization of design and systems evaluation methods

• User-focused
– Analytical (without users) evaluation
– Empirical (with users) evaluation

• Product creation process (PCP) oriented
– Exploratory (before design or after release)
– Predictive (after design and before implementation)
– Formative (during design and implementation)
– Summative (after implementation)

• Expert-based, participant-based and context-appropriate methods


Usability Engineering Lifecycle


Source: http://keithandrews.com/talks/2012/dd-2012-03-27/#(2)

1. Design-test-redesign
2. Design versus evaluation

Source: http://keithandrews.com/talks/2012/dd-2012-03-27/#(5)
See also, for details: http://user.medunigraz.at/andreas.holzinger/holzinger%20de/usability%20holzinger.html#Action%20Analysis%20(AA)



Analytic Evaluation (without users) and Empirical Evaluation (with users)

• Analytic method
– For example, evaluating an axe’s characteristics (e.g. the design of the bit, the weight distribution, the steel alloy used)
• Empirical method
– Studying a good axeman as he uses the tool

Rosson, M. B., & Carroll, J. M. (2002).

Formative Evaluation (during design) and Summative Evaluation (at the end)

• Formative evaluation takes place during the design process.
– Goals: to identify aspects of a design that can be improved, to set priorities, and in general to provide guidance on how to make changes to a design.
– A typical formative evaluation would be to ask a user to think out loud as he or she attempts a series of realistic tasks with a prototype system.
• Summative evaluation happens at the end of a development process.
– Goals: to answer “Does the system meet its specified goals?” or “Is this system better than its predecessors and competitors?”
– Most likely to happen at the end of a development process, when the system is tested to see if it has met its usability objectives.
– Can also take place at critical points during development to determine how close the system is to meeting its objectives, or to decide whether and how much additional resources to assign to a project.

Rosson, M. B., & Carroll, J. M. (2002).

ISO 9241-210: Human-centred design for interactive systems

Source: http://uxlabs.co.uk/services/; adapted (red-colored boxes)

PACT Analysis

Models: Use cases, Rich picture, User stories

The phases in Rogers et al. (2013) and in ISO 9241-210 are not the same!


Evaluation is closely tied to …

• the key activities of interactive systems design: understanding, design and envisionment.

• who is involved in the evaluation
– Expert-based methods (what type of expert? Usability expert or interaction designer)
– Participant/end-user methods (what type? Target users, other designers, students or others)


Consider an exercise: Collect several advertisements for small, personal technologies such as that shown in Figure 10-1. What claims are the advertisers making about design features and benefits? What issues does this raise for their evaluation?

Source: Benyon, 2010, p. 226


The four roles that children may have in the design of new technologies


Druin, A. (2002). The role of children in the design of new technology. Behaviour & Information Technology, 21(1), 1–25. http://doi.org/10.1080/01449290110108659

EXPERT EVALUATION

Formal heuristic and informal expert review


Expert Evaluation: Heuristic evaluation

• Heuristic: enabling a person to discover or learn something for themselves.

• “Heuristic evaluation refers to a number of methods in which a person trained in HCI and interaction design examines a proposed design to see how it measures up against a list of principles, guidelines or ‘heuristics’ for good design. This review [of interfaces] may be a quick discussion over the shoulder of a colleague, or may be a formal, carefully documented process.”


List of the design principles, or heuristics (a specific rule of thumb or argument derived from experience):
1. Visibility
2. Consistency
3. Familiarity
4. Affordance
5. Navigation
6. Control
7. Feedback
8. Recovery
9. Constraints
10. Flexibility
11. Style
12. Conviviality

Expert Evaluation: Discount Usability/Heuristic Evaluation

• Three overarching usability principles
– learnability (principles 1–4)
– effectiveness (principles 5–9)
– accommodation (principles 10–12)

• A ‘quick and dirty’ approach, for time-pressured evaluation practitioners in need of feedback

• “Woolrych and Cockton (2000) conclude that the heuristics add little advantage to an expert evaluation and the results of applying them may be counter-productive.”

• “They (and other authors) suggest that more theoretically informed techniques such as the cognitive walkthrough offer more robust support for problem identification.”


Heuristic evaluation (cont.)

• Heuristic evaluation is valuable as formative evaluation, to help the designer improve the interaction at an early stage.
• It should not be used as a summative assessment, to make claims about the usability and other characteristics of a finished product.
• If that is what we need to do, then we must carry out properly designed and controlled experiments with a much greater number of participants.
• Ecological validity: the results of most user testing can only ever be indicative of issues in real-life usage, because people tend to adapt technology in ways it was not designed for. So, also consider:
– Ethnographically informed observations of technologies in long-term use
– Having users keep diaries, which can be audio-visual as well as written
– Collecting ‘bug’ reports (often these are usability problems) and help centre queries.


Cognitive walkthrough

• The cognitive walkthrough entails a usability analyst stepping through the cognitive tasks that must be carried out in interacting with technology.
• Inputs to the process are:
– An understanding of the people who are expected to use the system
– A set of concrete scenarios representing both (a) very common and (b) uncommon but critical sequences of activities
– A complete description of the interface to the system (hierarchical task analysis, HTA).
• The ‘cognitive jogthrough’ (Rowley and Rhoades, 1992): video records (rather than conventional minutes) are made of walkthrough meetings and annotated to indicate significant items of interest, design suggestions are permitted, and low-level actions are aggregated wherever possible.

• The cognitive walkthrough is very often practiced (and taught) as a technique executed by the analyst alone, to be followed in some cases by a meeting with the design team.


PARTICIPANT-BASED EVALUATION

Usability Evaluation


Cooperative evaluation

• The technique is ‘cooperative’ because participants are not passive subjects but work as co-evaluators.



Cooperative evaluation (cont.)


Sample questions during the evaluation: What do you want to do? What were you expecting to happen? What is the system telling you? Why has the system done that? What are you doing now?

Sample questions after the session: What was the best/worst thing about the prototype? What most needs changing? How easy were the tasks? How realistic were the tasks? Did giving a commentary distract you?

Participatory heuristic evaluation

• The developers of participatory heuristic evaluation (Muller et al., 1998) claim that it extends the power of heuristic evaluation without adding greatly to the effort required.

• The procedure for the use of participatory heuristic evaluation is just as for the expert version, but the participants are involved as ‘work-domain experts’ alongside usability experts and must be briefed about what is required.


Co-discovery

• Co-discovery is a naturalistic, informal technique that is particularly good for capturing first impressions. It is best used in the later stages of design.

• Watching individual people interacting with the technology, and possibly ‘thinking aloud’ as they do so, can be varied by having participants explore new technology in pairs.

• Depending on the data to be collected, the evaluator can take an active part in the session by asking questions or suggesting activities, or simply monitor the interaction either live or using a video-recording.


Figure. Catroid Co-discovery test

Source: http://keithandrews.com/talks/2012/dd-2012-03-27/#(15)



Controlled experiments

• Controlled experiments are appropriate where the designer is interested in particular features of a design, perhaps comparing one design to another to see which is better.

• Identify the independent and dependent variables, and the design decision at stake.
• E.g. you might want to judge which Web design is better based on the number of clicks needed to achieve some task; speed of access could be the dependent variable for selecting a function.

• Watch for confounding variables: learning effects, the effects of different tasks, the effects of different background knowledge, etc.

• Considering participants’ differences, the next stage is to decide whether each participant will participate in all conditions (so-called within-subject design) or whether each participant will perform in only one condition (so-called between-subject design).
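
The two designs can be illustrated with a short sketch. It is only an illustration under stated assumptions: the participant IDs, the condition names and the helper functions `assign_between` and `assign_within` are hypothetical and do not come from Benyon (2010). Rotating the order of conditions is one simple way to spread learning effects in a within-subject design; it is not the only one.

```python
import random


def assign_between(participants, conditions, seed=0):
    """Between-subject: each participant performs in exactly one condition."""
    rng = random.Random(seed)      # fixed seed so the allocation is repeatable
    shuffled = participants[:]     # copy, so the caller's list is not reordered
    rng.shuffle(shuffled)
    # Deal shuffled participants round-robin: group sizes differ by at most one.
    return {p: conditions[i % len(conditions)] for i, p in enumerate(shuffled)}


def assign_within(participants, conditions):
    """Within-subject: everyone performs in all conditions.

    The starting condition is rotated from one participant to the next so
    that no condition is always met first (a guard against learning effects).
    """
    k = len(conditions)
    return {
        p: conditions[i % k:] + conditions[:i % k]
        for i, p in enumerate(participants)
    }


# Hypothetical data, for illustration only.
participants = ["P1", "P2", "P3", "P4"]
conditions = ["design_A", "design_B"]

between = assign_between(participants, conditions)
within = assign_within(participants, conditions)
```

With two conditions and four participants, `between` places two people in each group, while `within` gives every participant both designs in alternating order.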


EVALUATION IN PRACTICE

The trend in current practice


Trend

• A survey of 103 experienced practitioners of human-centred design conducted in 2000 indicates that
– around 40 per cent of those surveyed conducted ‘usability evaluation’,
– around 30 per cent used ‘informal expert review’,
– around 15 per cent used ‘formal heuristic evaluation’.
• What is the current trend?


Main steps of evaluation project/phase

1. Establish the aims of the evaluation, the intended participants in the evaluation, the context of use and the state of the technology; obtain or construct scenarios illustrating how the application will be used.
2. Select evaluation methods. These should be a combination of expert-based review methods and participant methods.
3. Carry out expert review.
4. Plan participant testing; use the results of the expert review to help focus this.
5. Recruit people and organize testing venue and equipment.
6. Carry out the evaluation.
7. Analyse results, document and report back to designers.



Perceived costs and benefits of evaluation methods


Figure: A survey of user-centred design practice (Benyon, 2010, p. 237, citing Vredenburg, K., Mao, J.-Y., Smith, P.W. and Carey, T., 2002)

Metrics (and measures)

• Three things to keep in mind when deciding metrics:

– Just because something can be measured, it doesn’t mean it should be.

– Always refer back to the overall purpose and context of use of the technology.

– Consider the usefulness of the data you are likely to obtain against the resources it will take to test against the metrics.



People who will use the system

• Nielsen’s recommended sample of 3–5 participants has been accepted wisdom in usability practice for over a decade.
• For a heterogeneous set of customers, run 3–5 people from each group through your tests.
• If you cannot recruit any genuine participants, then fall back on convenience recruitment: one of your colleagues, a friend, your mother or anyone you trust to give you a brutally honest reaction. But be extremely careful as to how far you generalize from your findings.
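
The 3–5 figure is usually motivated by a simple problem-discovery model associated with Nielsen and Landauer, which is not part of these slides. As a rough sketch, assume every participant independently reveals any given problem with the same probability `p_detect`; the default of 0.31 is the average Nielsen reports and should be treated as an assumption, since real projects vary widely. The function name is hypothetical.

```python
def share_of_problems_found(n_participants, p_detect=0.31):
    """Expected share of usability problems seen at least once.

    Assumes each participant independently reveals any given problem with
    probability p_detect (an assumed value, not a property of your system).
    """
    return 1 - (1 - p_detect) ** n_participants


# Diminishing returns: each extra participant adds less than the one before.
for n in (1, 3, 5, 10):
    print(n, round(share_of_problems_found(n), 2))
```

Under these assumptions five participants uncover roughly 84 per cent of the problems, which is why several small test rounds are often preferred to one large one.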


The test plan and task specification

• A plan should be drawn up to guide the evaluation.
– Aims of the test session
– Practical details, including where and when it will be conducted, how long each session will last, the specification of equipment and materials for testing and data collection, and any technical support that may be necessary
– Numbers and types of participant
– Tasks to be performed, with a definition of successful completion. This section also specifies what data should be collected and how it will be analysed.

• Conduct a pilot session and fix any unforeseen difficulties



Reporting usability evaluation results to the design team

• The report should be ordered either by areas of the system concerned, or by severity of problem.

• A face-to-face meeting may have more impact than a written document alone (although this should always be produced as supporting material) and this would be the ideal venue for showing short video clips of participant problems.

• Usability problems can be fed into a ‘bug’ reporting system if one exists.

• They can be turned into user stories as part of the sprint backlog in a Scrum development approach.
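
The two report orderings, and the step from a finding to a user story, might be sketched as follows. Everything here is illustrative: the `Finding` fields, the 1–4 severity scale, the sample findings and the user-story wording are assumptions, not a format prescribed by Benyon (2010) or by Scrum.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    area: str         # part of the system concerned
    severity: int     # hypothetical scale: 1 = cosmetic ... 4 = blocks the task
    description: str


# Invented example findings, for illustration only.
findings = [
    Finding("checkout", 2, "Delivery date format is ambiguous"),
    Finding("search", 4, "No feedback when a query returns nothing"),
    Finding("checkout", 3, "Error message does not say how to recover"),
]


def by_severity(items):
    """Most severe first; ties keep their original order (sorted is stable)."""
    return sorted(items, key=lambda f: -f.severity)


def by_area(items):
    """Areas in alphabetical order, most severe first within each area."""
    return sorted(items, key=lambda f: (f.area, -f.severity))


def as_user_story(finding):
    """Phrase a finding as a backlog item (the wording is only a suggestion)."""
    return f"As a user of {finding.area}, I want this fixed: {finding.description}."


for f in by_severity(findings):
    print(f.severity, f.area, "-", as_user_story(f))
```

Ordering by severity suits a meeting where priorities are set; ordering by area suits handing each part of the report to the people responsible for that part of the system.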


EVALUATION: FURTHER ISSUES

The mix category


Evaluation without being there

• Internet connectivity enables evaluations without being physically present.
• If the application itself is Web-based, or can be installed remotely, instructions can be supplied so that users can run test tasks and fill in and return questionnaires in soft or hard copy.

• On-line questionnaires and crowdsourcing methods are appropriate here (see Benyon, 2010, Chapter 7).


Physical and physiological measures: Evidence of emotional reactions

• Eye-movement tracking (or ‘eye tracking’) can show participants’ changing focus on different areas of the screen.

• Physiological techniques in evaluation rely on the fact that all our emotions generate physiological changes.
– The most common measures are of changes in heart rate, the rate of respiration, skin temperature, blood volume pulse and galvanic skin response (an indicator of the amount of perspiration).

• Body-connected sensors are linked to software which converts the results to numerical and graphical formats for analysis.


Evaluating presence – in virtual reality

• Designers of virtual reality – and some multimedia – applications are often concerned with the sense of presence, of being ‘there’ in the virtual environment rather than ‘here’ in the room where the technology is being used.

• Methods:
– Questionnaire
– Written accounts of experience/interview
– Observation in virtual environment
– Direct physiological measures


Evaluation at home

• Issues: privacy, time and motivation
• Methods
– Interviews
– Act out scenarios
– Diaries

“working with children is a good way of drawing parents into evaluation activities.”


The investigator supplied users with Post-its to capture their thoughts about design concepts (Figure 10.5). An illustration of each different concept was left in the home in a location where it might be used, and users were encouraged to think about how they would use the device and any issues that might arise. These were noted on the Post-its, which were then stuck to the illustration and collected later.


Questions and clarifications


References

• Druin, A. (2002). The role of children in the design of new technology. Behaviour & Information Technology, 21(1), 1–25. http://doi.org/10.1080/01449290110108659
• Rosson, M. B., & Carroll, J. M. (2002). Usability engineering: scenario-based development of human-computer interaction (1st ed.). San Francisco: Academic Press.