
Evaluation Guidelines and Methods


Page 1: Evaluation Guidelines and Methods

Human-Computer Systems

MM4HCI, 2013

Lecture 3 – Evaluation methods and guidelines

Professor Sarah Sharples
Department of Mechanical, Materials and Manufacturing Engineering

Page 2: Evaluation Guidelines and Methods

AN EVALUATION FRAMEWORK

Page 3: Evaluation Guidelines and Methods

Outline

1. Understand what evaluation is for
2. Prepare for an evaluation
3. Understand the range of evaluation techniques and their uses
4. Understand some of the practical issues of applying evaluation methods

Main reading: Sharp et al., Chapters 12, 14 and 15

Page 4: Evaluation Guidelines and Methods

What is evaluation?

• Involving users, and user representatives, in the technology/ICT design and development process in a structured manner
• Capturing responses to a design or a design artefact
• Can be carried out at any point in the development process

Page 5: Evaluation Guidelines and Methods

[Figure: usability goals (efficient to use, effective to use, safe to use, have good utility, easy to learn, easy to remember how to use) surrounded by user experience goals (fun, emotionally fulfilling, rewarding, supportive of creativity, aesthetically pleasing, motivating, helpful, entertaining, enjoyable, satisfying). Source: Preece et al., 2002]

Page 6: Evaluation Guidelines and Methods

Evaluation choice considerations

• Why – why are you conducting the evaluation?
• What – what do you have to evaluate (e.g. prototypes)?
• Who – who is going to help you (users, experts)?
• When – when in the development process?
• Where – do you need a 'clean' environment, or context?
• How – what method are you going to use?

Page 7: Evaluation Guidelines and Methods

Why evaluate?

• To ensure a 'user-centred design': easy to learn, easy to use, efficient, useful, satisfying to use
• From a human factors perspective: safe (for the operator), safe for the system, optimal system performance (Hollnagel and Woods, 2006)
• To inform and evolve the design (saves time and money); to verify requirements (Chevalier and Kicka, 2006)
• Benchmarking and comparison

Page 8: Evaluation Guidelines and Methods

What data do you need to capture?

• Satisfaction
• Ease of learning
• Usability
• Performance / efficiency

Page 9: Evaluation Guidelines and Methods

What have you got to evaluate?

Lo-fi prototypes
• Benefits: cheap; addresses layout; proof-of-concept; open to participatory design and comment (Erickson, 1995)
• Drawbacks: navigation and flow limitations for evaluation; does not support good quantitative measures (e.g. errors)
• Best used early on, or for rapid re-designs

Hi-fi prototypes
• Benefits: complete functionality; supports quantitative evaluation (e.g. user error rates); a marketing and sales tool; a living specification
• Drawbacks: expensive; time consuming; perceived limited scope for change
• Best used for quantitative user evaluation, and as part of proofs of concept crossing business functions

Page 10: Evaluation Guidelines and Methods

Who is going to be involved?

• Do you need to match against certain characteristics?
  • Age, gender, education, prior knowledge
  • Physical, cognitive and attitudinal implications
• Do any of your users pose particular challenges?
  • Older adults, children, children with special needs
• Can you use novices, or HCI experts?
• And how many? (this depends on the method)

Page 11: Evaluation Guidelines and Methods

When in the development process?

[Figure: evaluation effort plotted across the development process (Requirements, Concept, Design and Development, Implementation, Deployment), contrasting evaluation spread throughout the project with 'last-minute panic testing!!!' at the end]

Page 12: Evaluation Guidelines and Methods

Formative vs summative

• Formative
  • To inform the design process
  • Explorative, using partially completed artefacts (prototypes)
  • May be more qualitative or subjective
• Summative
  • A confirmation exercise
  • To ensure the design meets its intended aims
  • Often against a recognised standard or set of benchmarks (or the initial requirements)

Page 13: Evaluation Guidelines and Methods

Where? (see Duh et al., 2005)

• Lab
• Simulation
• Real world

Page 14: Evaluation Guidelines and Methods

Evaluation as part of user experience

[Diagram: user experience at the centre, shaped by technology, users, tasks and context]

• Technology: Is the technology new? Is there a novel input or output? How much does the technology influence the interaction?
• Users: Are the users experts, or do they have prior knowledge? Do they have specific characteristics?
• Tasks: Is 'how' they do it important? Is performance relevant? Are you investigating functionality?
• Context: Does where they use the technology influence the interaction? Social or physical factors; temporality – short or long periods of use

Page 15: Evaluation Guidelines and Methods

EVALUATION METHODS

Page 16: Evaluation Guidelines and Methods

Evaluation Approaches

Read Sharp, Rogers & Preece, 2007, Chapters 14 & 15 for more information.

• Analytical – predictive evaluation methods
• Field study – interpretive evaluation methods; collecting users' opinions
• Lab study – experiments and benchmarking; usability studies

Page 17: Evaluation Guidelines and Methods

Analytical – Predictive evaluation

HCI experts use their knowledge of users and technology to evaluate interface usability:

• Inspection methods and heuristics
• Accessibility (WCAG, 1999)
• User modelling – GOMS and KLM (a KLM sketch follows below)
• Walkthroughs
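
To make the KLM bullet concrete, here is a minimal Python sketch that predicts expert, error-free task time by summing the commonly cited Keystroke-Level Model operator times from Card, Moran and Newell; the operator values and the example task are illustrative, not calibrated for any particular interface.

# Keystroke-Level Model (KLM): predict expert task time by summing
# standard operator times. Values are the commonly cited defaults
# (illustrative only; calibrate for your own users and devices).
KLM_OPERATORS = {
    "K": 0.20,  # keystroke or button press (skilled typist; slower users take longer)
    "P": 1.10,  # point at a target with a mouse
    "H": 0.40,  # home hands between keyboard and mouse
    "M": 1.35,  # mental preparation before an action
}

def klm_predict(sequence: str) -> float:
    """Return the predicted time in seconds for an operator sequence such as 'MHPK'."""
    return sum(KLM_OPERATORS[op] for op in sequence)

# Hypothetical task: think, move hand to the mouse, point at a field,
# click, then type a four-character code.
print(klm_predict("MHPK" + "K" * 4))  # -> 3.85 seconds predicted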

Page 18: Evaluation Guidelines and Methods

Analytical – Heuristic evaluation

• ~5 HCI experts work independently
• General review of the product
• Focus on specific features
• Structured expert reviewing against guidelines, e.g. "use simple and natural language", "provide shortcuts"
• Collate the reviews to prioritise problems
• Five HCI experts typically find c. 75% of the usability problems in an interface (see the sketch below)
• BUT see Cockton and Woolrych, 2002
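
The "five experts find about 75%" figure reflects Nielsen and Landauer's problem-discovery model, in which the expected proportion of problems found by n independent evaluators is 1 − (1 − L)^n, where L is the probability that a single evaluator finds a given problem. A minimal sketch follows; L = 0.24 is chosen here only so that five evaluators land near the 75% quoted on this slide, and published estimates of L vary by study.

# Nielsen & Landauer's problem-discovery model: proportion of usability
# problems found by n independent evaluators, each finding a given
# problem with probability L.
def proportion_found(n_evaluators: int, l: float = 0.24) -> float:
    return 1 - (1 - l) ** n_evaluators

for n in range(1, 9):
    print(f"{n} evaluator(s): {proportion_found(n):.0%}")
# The output shows diminishing returns: each additional expert finds
# fewer new problems, which is why ~5 evaluators is the usual advice.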

Page 19: Evaluation Guidelines and Methods

Analytical – Walkthroughs

• Cognitive walkthroughs focus on ease of learning
• Scenario-based evaluation
• Three main questions:
  • Will the correct action be evident to the user?
  • Will the user notice that the correct action is available?
  • Will the user associate and interpret the response from the action correctly?
• Pluralistic walkthrough (experts, or experts + users)
• Participatory design

Page 20: Evaluation Guidelines and Methods

Analytical evaluation

Advantages:
• Experienced reviewers
• Users not involved
• Good experts will have knowledge of users
• Easy to set up and run the study

Disadvantages:
• Can be difficult and expensive to find experts
• Experts may have biases
• Some problems may get missed, and trivial problems identified

Page 21: Evaluation Guidelines and Methods

Field study – Interpretive evaluation

• Aims to enable designers to understand better how users use systems in context
• Qualitative data
• Description of performance/outcome

Page 22: Evaluation Guidelines and Methods

Field study – Data collection

• Informal and naturalistic methods of data collection
  • Observations, interviews, usage logging, focus groups (a logging sketch follows below)
• Contextual inquiry
  • Originates from ethnography
  • Observe the entire process of interface use, from switching on the computer to going home after task completion
• Co-operative and participative evaluation
  • Focus groups
  • Development of prototypes
  • Iterative design process
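
As an illustration of the usage logging mentioned above, a minimal Python sketch that appends timestamped interaction events to a file for later analysis; the file name and event fields are hypothetical.

import json
import time

LOG_PATH = "usage_log.jsonl"  # hypothetical log location

def log_event(participant: str, event: str, detail: str = "") -> None:
    """Append one timestamped interaction event as a JSON line."""
    record = {"time": time.time(), "participant": participant,
              "event": event, "detail": detail}
    with open(LOG_PATH, "a") as f:
        f.write(json.dumps(record) + "\n")

# Example: participant P03 opens the prototype's search screen.
log_event("P03", "screen_open", "search")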

Page 23: Evaluation Guidelines and Methods

Field study – Interviews vs focus groups?

• Do you want opinions, or actual tasks?
• Can you get error / timing / task data from focus groups?
• Are users familiar enough to remember usage?
• Do you have something to 'focus' on?
• Focus groups need careful planning and careful facilitation
• See Nielsen, 2000a

Page 24: Evaluation Guidelines and Methods

Field study methods

Advantages:
• Reveals what really happens in the context of use
• Description of performance or outcome
• Users directly involved
• Works well for formative evaluation of prototypes

Disadvantages:
• May not be easy to recruit participants
• Can be disruptive to the working environment
• True ethnographic studies require evaluator expertise
• Quality of results is variable

Page 25: Evaluation Guidelines and Methods

Lab study – Experiments and benchmarking

• The traditional approach to HCI
• A predicted relationship between variables
• Manipulate the independent variable (IV), measure the dependent variables (DVs) – a sketch follows below
• Generally use time/error measurement
• Specific human factors measures:
  • Workload – NASA TLX
  • Situation awareness – SAGAT
  • Body part discomfort
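
A minimal sketch of the IV/DV logic above: the interface version is the independent variable, task completion time the dependent variable, and an independent-samples t-test compares the two conditions. The times are invented illustrative data, not results from any real study.

# Compare task completion times (DV) across two interface designs (IV)
# with an independent-samples t-test.
from scipy import stats

times_design_a = [42.1, 38.5, 45.0, 40.2, 39.8, 44.3]  # seconds per task
times_design_b = [35.4, 33.9, 37.2, 36.8, 34.5, 38.0]

t, p = stats.ttest_ind(times_design_a, times_design_b)
print(f"t = {t:.2f}, p = {p:.3f}")  # a small p suggests a reliable difference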

Page 26: Evaluation Guidelines and Methods

Lab study – Usability testing

• An essential part of the evaluation process
• Structured interview and activity
• Observed and recorded (eye tracking, facial expressions, comments)
• Tends to be summative, towards the end of the process
• At the very least, needs an interactive prototype
• Can then be backed up with a survey, e.g. the SUS (scored as sketched below)
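
For reference, the SUS (Brooke, 1996) is scored as follows: ten items rated 1-5; odd-numbered items contribute (rating − 1) and even-numbered items (5 − rating); the sum is multiplied by 2.5 to give a 0-100 score. A minimal sketch with invented ratings:

def sus_score(ratings: list[int]) -> float:
    """Score a 10-item SUS questionnaire (each rating 1-5) on a 0-100 scale."""
    assert len(ratings) == 10 and all(1 <= r <= 5 for r in ratings)
    contributions = [
        (r - 1) if i % 2 == 0 else (5 - r)  # odd items sit at even indices
        for i, r in enumerate(ratings)
    ]
    return sum(contributions) * 2.5

print(sus_score([4, 2, 5, 1, 4, 2, 5, 2, 4, 1]))  # -> 85.0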

Page 27: Evaluation Guidelines and Methods

Lab study methods

Advantages:
• Studies are conducted under controlled conditions
• Experiments provide quantitative measures
• Focus on specific aspects of design or user performance
• Usability testing provides qualitative results
• Highlights particular usability problems

Disadvantages:
• Requires lab facilities and resources
• May require experimenter expertise
• Can be time consuming and expensive
• The unnatural setting may affect user behaviour
• Unrealistic tasks may not inform design

Page 28: Evaluation Guidelines and Methods

EVALUATION IN PRACTICE

Page 29: Evaluation Guidelines and Methods

DECIDE: a framework to guide evaluation (Preece, Rogers and Sharp, 2002, Chapter 11)

• Determine the goals
• Explore the questions
• Choose the evaluation approach and methods
• Identify the practical issues
• Decide how to deal with the ethical issues
• Evaluate, analyse, interpret and present the data

Page 30: Evaluation Guidelines and Methods

Applying methods across a project

[Figure: effort plotted across the project phases (Concept, Design and Development, Implementation, Deployment), with example evaluation activities mapped onto the timeline: travel application concepts; indoor navigation prototype testing; presentation of privacy information; lab usability study; live field trials]

Page 31: Evaluation Guidelines and Methods

Practical issues

• Selection and recruitment of participants
• Number of participants
• Finding evaluators
• Control over the environment and study set-up
• Equipment
• Budget constraints
• Schedule/deadlines
• Managing the session
  • Stepping back in interviews and focus groups

Page 32: Evaluation Guidelines and Methods

Ethical issues

• Develop an informed consent form
• Participants have a right to:
  - Know the goals of the study
  - Know what will happen to the findings
  - Privacy of personal information
  - Leave when they wish
  - Be treated politely

Page 33: Evaluation Guidelines and Methods

Example evaluation exercise

You are required to propose an evaluation programme to support the design of new voice technologies that help older adults interact with objects (e.g. furniture, electrical appliances) in their homes.

Page 34: Evaluation Guidelines and Methods

Summary (1)

There are many issues to consider before conducting an evaluation study.

These include the goals of the study, the approaches and methods to use, practical issues, ethical issues, and how the data will be collected, analysed and presented.

Evaluation and design are closely integrated in user-centred design.

Page 35: Evaluation Guidelines and Methods

References

• Cockton, G., & Woolrych, A. (2002). Sale must end: should discount methods be cleared off HCI's shelves? Interactions, 9(5), 13-18.
• Chevalier, A., & Kicka, M. (2006). Web designers and web users: Influence of the ergonomic quality of the web site on the information search. International Journal of Human-Computer Studies, 64(10), 1031-1048.
• Duh, H. B.-L., Tan, G. C. B., & Chen, V. H. (2005). Usability evaluation for mobile device: a comparison of laboratory and field tests. In Proceedings of the 8th Conference on Human-Computer Interaction with Mobile Devices and Services (pp. 181-186). New York, NY: ACM Press.
• Erickson, T. (1995). Notes on design practice: stories and prototypes as catalysts for communication. In J. M. Carroll (Ed.), Scenario-based design: Envisioning work and technology in system development (pp. 37-58). New York, NY: John Wiley & Sons.
• Nielsen, J. (2000a). The use and misuse of focus groups. Retrieved from http://www.useit.com/papers.
• Nielsen, J. (2001). Nielsen's ten usability heuristics. Retrieved from www.useit.com.
• Sharp, H., Rogers, Y., & Preece, J. (2007). Interaction Design: Beyond human-computer interaction (2nd edition). New York, NY: John Wiley & Sons. Chapters 12, 13, 14 & 15.
• Shneiderman, B. (1998). Designing the User Interface (3rd edition). Reading, MA: Addison-Wesley.
• Software Usability Measurement Inventory (SUMI). Retrieved January 2008 from http://sumi.ucc.ie/index.html.
• WCAG (1999). Web Content Accessibility Guidelines 1.0. Retrieved 28 February 2008 from http://www.w3.org/TR/1999/WAI-WEBCONTENT-19990505/

Page 36: Evaluation Guidelines and Methods

Summary (2)

Different evaluation approaches and methods are often combined in one study.

Triangulation involves using a combination of techniques to gain different perspectives, or analysing data using different techniques.

Dealing with constraints is an important skill for evaluators to develop.