Usability - uio.no

USABILITY TESTING

IN2020 Methods in Interaction Design, Autumn 2018

Chapter 10, presented by Ines Junge, PhD candidate

OVERVIEW FOR TODAY

• Chapter 10What we test with, in usability testing

• Types of usability testing

• User-based testing

• Stages, amount of users, locations, tasks

(connections to ch.“Measuring the human” &

“Automated Data Collection”).

• Usability testing versus traditional research

WARM-UP EXERCISE

Two students together:

One student has forgotten a book on a bus and should report this to Ruter

The other student shall measure the time, evaluate the solution and find out if this was difficult…

WHAT IS USABILITY?

• "The extent to which a product can be used by

specified users to achieve specified goals with

effectiveness, efficiency, and satisfaction in a

specified context of use." (ISO 9241-11)

USABILITY DIMENSIONS

Effectiveness –Can users complete tasks, achieve goals with the product?

01Efficiency –How much effort (time) do users require to do this?

02Satisfaction –What do users think about the product's ease of use?

03

… WHICH ARE AFFECTED BY

The users: Who uses the product: Highly trained/ experienced users, or novices?

01Their goals: What users try to do with the product; inhowfarwhat is to do supported?

02The usage situation (or 'context of use’): Where and how the product is used

03

USABILITY CONTINUED

• Usability is composed of:– Learnability: How easy it is for users to accomplish

basic tasks the first time they encounter the design.

– Efficiency: Once users have learned the design,how quickly they can perform tasks.

– Memorability:When users return to the design aftera period of not using it, how easily they can re-establish proficiency.

– Errors: How many errors users make, how severe theseare, and how easily users can recover from the errors.

– Satisfaction: How pleasant it is to use the design.

(Nielsen,1993 )

WHAT IS USABILITY TESTING?

• Involving representative users, representative tasks,

representative environments

– Testing paper prototypes

Example Usability Test

with a Paper Prototypehttps://www.youtube.com/watch?v=9wQkLthhHKA

Picture credits: https://s3.amazonaws.com/digitalgov/_legacy-img/2014/01/Prototyping-workshop-presentataion.pdf

https://www.youtube.com/watch?v=9wQkLthhHKA

https://s3.amazonaws.com/digitalgov/_legacy-img/2014/01/Prototyping-workshop-presentataion.pdf




– Wizard of Oz Wizard Of Oz Prototype

Walkthrough - Self-Service Checkout

Kiosk https://www.youtube.com/watch?v=d8SEFSpBYzY


https://www.youtube.com/watch?v=d8SEFSpBYzY




– Wizard of Oz

– Testing working version of a system before it is

released

– Testing working system

– Testing on different devices (smart phones,

laptops…)


• Improve the quality of an interface by finding flaws

in it

– cause problems for the majority of people

(not preferences)

• What works fine (keep it)

Flaws that cause problems for the majority of people

GOAL OF USABILITY TESTING?

10

https://public-media.interaction-

design.org/images/uploads/15606bb71fc51aac745d3

1fe2de6f8d1.gif

https://public-

media.interaction-

design.org/images/uploads/77

3821ff30b24bce0d471289139

64770.gif

WHAT IS “WRONG” HERE?

https://public-media.interaction-design.org/images/uploads/15606bb71fc51aac745d31fe2de6f8d1.gif

https://public-media.interaction-design.org/images/uploads/773821ff30b24bce0d47128913964770.gif

10

Flaws that cause problems for the majority of people

Think aloud with your classmate next to you and show

on the screen to the whole class in a mini-exhibition

afterwards! (5 min.)

Information overload? Mysterious navigation?

Added friction to user actions? “Clever” design

that ignores usability?

WHAT HAVE YOU ENCOUNTERED AS "WRONG"?

EXAMPLE

Alma 2016-55

• Yotaphone

• Settings of the data limit/1 GB warning

• Turnwheel with numbers (every single MB)

EXAMPLE

Alma 2016-55

• Yotaphone

• Settings of the data limit/1 GB warning

• Instead of turnwheel: Line for the limit movable in a graph

• Usability testing = general term:

– next to a total of 3 users, watching as they attempt

different tasks

– hundreds of users testing interfaces, in different

treatment groups

Reality: usually somewhere in between

• The best usability testing is the one that actually

takes place

– Be flexible, be reasonable, be practical

SCOPE OF USABILITY TESTING

Activities aiming to improve the ease

of use of an interface

• Expert-based testing

(usability inspection)

• Automated testing

(usability inspection)

• User-based testing

(usability testing)

USABILITY ENGINEERING

https://www.360logica.com/blog/effective-software-testing-tips-testers-go/

EXPERT-BASED TESTING

• Structured inspections done by interface experts

• Before tests with users

• Find!: Obvious flaws, confusing wording, inconsistent

layout etc.

❖ Heuristic review

Compare interface with the rules

❖ Consistency inspections

Series of screens or web pages inspected

❖ Cognitive walkthrough

Experts perform the tasks (high-frequency andimportant/seldom)

❖ Guidelines review

Web Content Accessibility Guidelines

HEURISTIC EVALUATION

• Enables designers to evaluate without users

• Economical technique to identify usability issues

early in the design process

➢ Briefing

…

➢ Evaluation period

…

➢ Debriefing

…

• teach to evaluators; ensure each person receives same briefing.

• become familiar with the UI and domain

Briefing

• compare UI against heuristics

• spend 1-2 hours with interface; minimal 2 interface passes

• take notes

Evaluation period

• Prioritize problems; rate severity

• aggregate results

• discuss outcomes with design/ development team

Debriefing

Jakob NielsenSatisfactionEfficiencyLearnabilityLow ErrorsMemorability

Ben ShneidermanEase of learningSpeed of task completionLow error rateRetention of knowledge over timeUser satisfaction

invented several usability methods,

including heuristic evaluation. He holds

79 United States patents, mainly on

ways of making the Internet easier to

use.

Jakob Nielsen has been called:

"the king of usability" (Internet

Magazine)

(https://www.nngroup.com/

people/jakob-nielsen/)

https://www.nngroup.com/people/jakob-nielsen/

Good Example:

Ring/Silent switch

htt

ps:

//fa

culty.

was

hin

gton.e

du/jte

nenbg/

cours

es/

360/f0

4/s

ess

ions/

schneid

erm

anG

old

enR

ule

s.htm

l

https://faculty.washington.edu/jtenenbg/courses/360/f04/sessions/schneidermanGoldenRules.html

1. Validity of system status

- Are users kept informed about what is going on?

- Is appropriate feedback provided within reasonable time about a user’s action?

2. Match between system and the real world

- Is the language used at the interface simple?

- Are the words, phrases and concepts used familiar to the user?

3. User control and freedom

- Are there ways of allowing users to easily escape from places they unexpectedly

find themselves in?

4. Consistency and standards

- Are the ways of performing similar actions consistent?

5. Help users recognize, diagnose, and recover from errors

- Are user messages helpful?

- Do they use plain language to describe the nature of the problem and suggest

a way of solving it?

6. Error prevention

- Is it easy to make errors?

- If so, where and why?

7. Recognition rather than recall

- Are objects, actions and options always visible?

8. Flexibility and efficiency of use

- Have accelerators (i.e. shortcuts) been provided that allow more experienced users to

carry out tasks more quickly?

9. Aesthetic and minimalist design

- Is any unnecessary and irrelevant information provided?

10. Help and documentation

- Is help information provided that can be easily searched and easily followed?

Ten U

sabili

ty H

euri

stic

s by

Jako

b N

iels

en

htt

p://w

ww

.use

it.c

om

/pap

ers

/heuri

stic

/heuri

stic

_lis

t.htm

l

htt

ps:

//tf

a.st

anfo

rd.e

du/d

ow

nlo

ad/T

enU

sabili

tyH

euri

stic

s.pdf

https://tfa.stanford.edu/download/TenUsabilityHeuristics.pdf

SEVERITY RATINGS

0 – this is not a usability problem

1 – cosmetic problem only

2 – minor usability problem

3 – major usability problem

4 – usability catastrophe; imperative to fix

Combination of frequency and impact

PROS AND CONS OF HE

• Pro

– Very cost effective

– Identifies many usability issues

• Contra

– relies on interpretation of guidelines

- tends to uncover many low-severity problems

– guidelines may be too generic

– needs more than one evaluator to be effective

- additional practical problem: multiple usability

experts more expensive and difficult to find as it

is to test 3-5 users

COULD-BE EXERCISE

– Prepare heuristic evaluation

with 8 golden rules

– Conduct heuristic evaluation

• Cognitive

walkthrough

involves simulating a

user’s problem-solving

process at each step in

the human- computer

dialog, checking to see

if the user’s goals and

memory for actions can

be assumed to lead to

the next correct action.

– Nielsen and

Mack, 1994

WALKTHROUGH BASICS

• Imagine how well a user could perform tasks

with your low-fidelity prototype

• Manipulate prototype as you go

– evaluate choice-points in the interface

– evaluate labels or options

– evaluate likely user navigation errors

• Revise prototype and perform again

WHY WE USE CW

• Cognitive walkthrough enables a designer to evaluate

an interface without users– a designer attempts to see the interface from the perspective of a user

• Low-investment technique to identify task-related

usability issues early on– no implementation or users required

– can be performed on existing interfaces

• Identify task-related problems before

implementation– invest a little now, save a lot later

• Enables rapid iteration early in design– can do several evaluations of trouble points

• Evaluations are only effective if your team– has the right skill set

– wants to improve the design, not defend it

WHY WE USE CW CONTINUED

Once one

• knows who the users are

• has a low-fidelity (paper) prototype of the

interface with enough “functionality” for

several tasks

• has task descriptions

• has scenarios designed to complete the

task

• has an evaluation team ready

WHEN TO DO/WHAT IS NEEDED

• Tell a story of why a user would perform it

• Critique the story by asking critical questions– Is the control for the desired action visible? Will a user see that the

control produces the desired effect? Will a user select a different

control instead? Will the action have the effect that the user intends?

Will a user understand the feedback and proceed correctly? What

happens if the user is wrong? Is there feedback to correct the error?

How would a user of <interface> react here?

Questions help see problems (focus, not a blindfold), but

time consuming → Streamlined CW technique with only

two questions

FOR EACH ACTION IN A TASK

• Easy to learn, to be performed early in the design process, or

be applied during any phase of dev.

- without first hand access to users

• Questions assumptions about what a user may be thinking

- unlike some usability inspection methods, takes explicit

account of the user's task

• Helps identify controls obvious to the designer but not a

user, difficulties with labels and prompts and inadequate

feedback

- quick and inexpensive to apply if done in a streamlined form

WALKTHROUGH PROS

30

• Diagnostic, not prescriptive

• Focuses mostly on novice users

• Designer's difficulty to put themselves in

user's mind

• Focus on task-related issues

• Interactions slower and not real

• Not providing quantitative results

Yet, a useful tool in conjunction with others

WALKTHROUGH CONS

• Compare and contrast Cognitive Walkthrough

with Heuristic Evaluation!

- Neither testing with real users,

nevertheless users’ point of view

- Uncover many usability problems

- Double expertise - both Human Computer

Interaction and the specific domain

- Multiple evaluators are best

- Inspections can be more thorough than

user testing

“Usa

bili

ty insp

ect

ion m

eth

ods

afte

r 15

year

s of re

sear

ch a

nd p

ract

ice”

(2007):

htt

ps:

//dig

ital

com

mons.

ute

p.e

du/c

gi/v

iew

con

tent.cg

i?re

fere

r=&

htt

psr

edir

=1&

articl

e=

100

9&

amp;c

onte

xt=

cs_pap

ers

DISCUSSION

https://digitalcommons.utep.edu/cgi/viewcontent.cgi?referer=&httpsredir=1&article=1009&context=cs_papers

• Software, compares interface with theguidelines

• Produce report and/or fix the code

• Manual check often needed: <alt> tag

(alternative for graphics) checked for content but not if

the text is appropriate – 'picture here'

• Number of fonts, average font size, deepestlevel of a menu, etc.

1371

AUTOMATED USABILITY TESTING

• Select representative users

• Select the setting

• Decide what tasks users should perform

• Decide what type of data to collect

• Before the test session (informed consent,

etc.)

• During the test session

• Debriefing after the session

Alma 2017-32

USER-BASED TESTING

WHEN TO TEST/STAGES

– Users perform tasks (early or later inthe development)

– Formative testing

• Low-level fidelity prototypes

• How the user perceives an interface component?

• Low-cost of paper prototypes, users comfortable to criticize

– Summative testing

• Evaluate the effectiveness of specific design choices

• High fidelity prototype

– Validation test

• Before release, compare to benchmarks

34

HOW MANY USERS?

– 5 users will find approximately 80% of problems

– 7 for small projects, 15 for medium-large projects

– Goal to find the major flaws , that will cause the

most problems and must be fixed

– How many users can we afford? How many users

can we get? How many users do we have time

for?

LOCATION

– Lab

– Portable usability lab

– Remote studies (time, place)

TASKS

• Clearly formulated; no explanation

• Tested

• One clear answer/way to solve

• Tasks that are performed often

+ critical tasks (logging in)

• No private, financial information

• Clear transition to next task

Oppgave x: Du er hjemme og du ser NRK Nett- TV pådin PC (http://www.nrk.no/nett-tv)

Se på episoden En uke i giftig avfall

av Reggie Yates og svar på følgende spørsmål:

• Hva heter den første av «brenneguttene» Reggie Yates treffer på Agbogbloshie?

• Hva viser seg å ikke være sant for britisk opprinnelse av salgsvarene Reggie og guttene finner?

EXAMPLE FOR CREATING TASKS

http://www.nrk.no/nett-tv

https://tv.nrk.no/serie/reggie-yates

– Task performance (Usability Metrics for

Effectiveness)

Completion Rate, Number of Errors

– Time performance (Usability Metrics for

Efficiency)

Time-Based Efficiency, Overall Relative Efficiency

– User satisfaction (validated survey tool) (Usability

Metrics for Satisfaction)

Task Level Satisfaction, Test Level Satisfaction (Qualitative

data)

– Additional:

– Average time to recover from an error

– Time spent using help

– Number of visits to the search feature

Utilizing logging

– Time spent on specific web page

– Mouse movements, typing speed

Key logging → ch. 12 Automated data collection (yellow

ed.), Eye tracking → ch. 13 Measuring the human

MEASUREMENT

TESTING SESSION

• Preparation

• The session

• After the session

• Example –A Lifesaving Device to Prevent Wandering (Usability Test):

https://www.youtube.com/watch?v=Cri6rFjUgc4

• High-fidelity prototype testing of Tripadvisor.com

https://www.youtube.com/watch?v=W17oYOhWIWE

https://www.youtube.com/watch?v=Cri6rFjUgc4

https://www.youtube.com/watch?v=W17oYOhWIWE

MAKING SENSE OF THE DATA

• Write up the results and help influence the

design of the specific interface

• Presentation to developers and managers

• Report should include

❑ All flaws

❑ Priorities

❑ For each flaw: description of the

problem, the data, priority, suggestions

for fix, estimated time/efforts for fix

• Report structure

❖ How you prepared & did UT

❖ What happened during the testing

❖ The implications and recomendations

QUESTIONS – PART I

Go to any interface (f.e. an app on

you phone)

• Difference between a summative and a

formative usability test of this interface

• Should we first test it with users or

with experts? Why?

• What are the differences between a

usability test and an experiment with

this app?

Interface usability test

• What is important to remind

participants before the usability test?

• What is the "think aloud" protocol?

Should it be used in summative or

formative studies?

• What are properties of good

tasks?

QUESTIONS – PART II

ASSIGNMENT 3, PART B

• a user-based usability test

• to evaluate a new interface

• allows people to track online their medical information,

such as blood tests, diagnostics, annual check-ups, and

patient visits

USABILITYVERSUS RESEARCH?

• UT similar to classic research approach

• But different goals

• Usability study – Research study - Research on

usability methods

• Historical roots of usability testing and experimental research and ethnography similar

• All part of usability testing: observation, key logging, click-stream analysis, quantitative measurement such as task and time performance, anonymity of participants


• Goals different

• Usability testing:

▪ An industry-based approach to improving interfaces

▪ Little concern for using only one approach or a theoretical point of view

▪ Building a successful product using the fewest resources, the fewest risks, in the shortest amount of time



• Practice different

• Usability testing:

▪ Practices, unacceptable in experimentaldesign: modifying the interface after every user(normal here)

▪ Not used to generalize (other interfaces,situations, or user groups)

▪ Often results kept confidential

A COMPARISON

Classical Research UsabilityTesting

Experimental design- isolate and

understand specific phenomena, with the

goal of generalization to other problems

Find and fix flaws in a specific

interface, no goal of generalization

Experimental design- a larger number of

participants is required

A small number of participants can be

utilized

Ethnography- observe to understand

the context of people, groups, and

organizations

Observe to understand where in the

interface users are having problems.

Ethnography- researcher participation is

encouraged

Researcher participation is not

encouraged in any way

General vs. specific

Nr. of paticipants

Focus on…

Researcher’s participation

Classical Research UsabilityTesting

Ethnography-longer-term research

methodShort-term research method

Ethnography or experimental design-

used to understand problems and/or

answer research questions

Used in systems and interface

development

Ethnography or experimental design- used

at earlier stages, often separate (or only

quasi-related) from the interface

development process

Typically takes place in later stages, after

interfaces (or prototype versions of

interfaces) have already been developed

Used for understanding problems Used for evaluating solutions

A COMPARISON

Time scale

Investigation vs.

development

Stage of R&D

Problems vs.

solutions

• Usability testing IS different from

traditional research

• also researchABOUT usability testing, with

research questions such as:

– Which methods are most effective?

– Which methods are most cost-effective?

– How many users are optimal?

RESEARCH ABOUT USABILITY TESTING

THAT WAS USABILITY TESTING

IN2020 Methods in Interaction Design, Autumn 2018

Chapter 10, presented by Ines Junge, PhD candidate

Documents

Usability - uio.no