Upload
others
View
4
Download
0
Embed Size (px)
Citation preview
USABILITY TESTING
IN2020 Methods in Interaction Design, Autumn 2018
Chapter 10, presented by Ines Junge, PhD candidate
OVERVIEW FOR TODAY
• Chapter 10What we test with, in usability testing
• Types of usability testing
• User-based testing
• Stages, amount of users, locations, tasks
(connections to ch.“Measuring the human” &
“Automated Data Collection”).
• Usability testing versus traditional research
WARM-UP EXERCISE
Two students together:
One student has forgotten a book on a bus and should report this to Ruter
The other student shall measure the time, evaluate the solution and find out if this was difficult…
WHAT IS USABILITY?
• "The extent to which a product can be used by
specified users to achieve specified goals with
effectiveness, efficiency, and satisfaction in a
specified context of use." (ISO 9241-11)
USABILITY DIMENSIONS
Effectiveness –Can users complete tasks, achieve goals with the product?
01Efficiency –How much effort (time) do users require to do this?
02Satisfaction –What do users think about the product's ease of use?
03
… WHICH ARE AFFECTED BY
The users: Who uses the product: Highly trained/ experienced users, or novices?
01Their goals: What users try to do with the product; inhowfarwhat is to do supported?
02The usage situation (or 'context of use’): Where and how the product is used
03
USABILITY CONTINUED
• Usability is composed of:– Learnability: How easy it is for users to accomplish
basic tasks the first time they encounter the design.
– Efficiency: Once users have learned the design,how quickly they can perform tasks.
– Memorability:When users return to the design aftera period of not using it, how easily they can re-establish proficiency.
– Errors: How many errors users make, how severe theseare, and how easily users can recover from the errors.
– Satisfaction: How pleasant it is to use the design.
(Nielsen,1993 )
WHAT IS USABILITY TESTING?
• Involving representative users, representative tasks,
representative environments
– Testing paper prototypes
Example Usability Test
with a Paper Prototypehttps://www.youtube.com/watch?v=9wQkLthhHKA
Picture credits: https://s3.amazonaws.com/digitalgov/_legacy-img/2014/01/Prototyping-workshop-presentataion.pdf
• Involving representative users, representative tasks,
representative environments
– Testing paper prototypes
– Wizard of Oz Wizard Of Oz Prototype
Walkthrough - Self-Service Checkout
Kiosk https://www.youtube.com/watch?v=d8SEFSpBYzY
WHAT IS USABILITY TESTING?
• Involving representative users, representative tasks,
representative environments
– Testing paper prototypes
– Wizard of Oz
– Testing working version of a system before it is
released
– Testing working system
– Testing on different devices (smart phones,
laptops…)
WHAT IS USABILITY TESTING?
• Improve the quality of an interface by finding flaws
in it
– cause problems for the majority of people
(not preferences)
• What works fine (keep it)
Flaws that cause problems for the majority of people
GOAL OF USABILITY TESTING?
10
https://public-media.interaction-
design.org/images/uploads/15606bb71fc51aac745d3
1fe2de6f8d1.gif
https://public-
media.interaction-
design.org/images/uploads/77
3821ff30b24bce0d471289139
64770.gif
WHAT IS “WRONG” HERE?
10
Flaws that cause problems for the majority of people
Think aloud with your classmate next to you and show
on the screen to the whole class in a mini-exhibition
afterwards! (5 min.)
Information overload? Mysterious navigation?
Added friction to user actions? “Clever” design
that ignores usability?
WHAT HAVE YOU ENCOUNTERED AS "WRONG"?
EXAMPLE
Alma 2016-55
• Yotaphone
• Settings of the data limit/1 GB warning
• Turnwheel with numbers (every single MB)
EXAMPLE
Alma 2016-55
• Yotaphone
• Settings of the data limit/1 GB warning
• Instead of turnwheel: Line for the limit movable in a graph
• Usability testing = general term:
– next to a total of 3 users, watching as they attempt
different tasks
– hundreds of users testing interfaces, in different
treatment groups
Reality: usually somewhere in between
• The best usability testing is the one that actually
takes place
– Be flexible, be reasonable, be practical
SCOPE OF USABILITY TESTING
Activities aiming to improve the ease
of use of an interface
• Expert-based testing
(usability inspection)
• Automated testing
(usability inspection)
• User-based testing
(usability testing)
USABILITY ENGINEERING
https://www.360logica.com/blog/effective-software-testing-tips-testers-go/
EXPERT-BASED TESTING
• Structured inspections done by interface experts
• Before tests with users
• Find!: Obvious flaws, confusing wording, inconsistent
layout etc.
❖ Heuristic review
Compare interface with the rules
❖ Consistency inspections
Series of screens or web pages inspected
❖ Cognitive walkthrough
Experts perform the tasks (high-frequency andimportant/seldom)
❖ Guidelines review
Web Content Accessibility Guidelines
HEURISTIC EVALUATION
• Enables designers to evaluate without users
• Economical technique to identify usability issues
early in the design process
➢ Briefing
…
➢ Evaluation period
…
➢ Debriefing
…
• teach to evaluators; ensure each person receives same briefing.
• become familiar with the UI and domain
Briefing
• compare UI against heuristics
• spend 1-2 hours with interface; minimal 2 interface passes
• take notes
Evaluation period
• Prioritize problems; rate severity
• aggregate results
• discuss outcomes with design/ development team
Debriefing
Jakob NielsenSatisfactionEfficiencyLearnabilityLow ErrorsMemorability
Ben ShneidermanEase of learningSpeed of task completionLow error rateRetention of knowledge over timeUser satisfaction
invented several usability methods,
including heuristic evaluation. He holds
79 United States patents, mainly on
ways of making the Internet easier to
use.
Jakob Nielsen has been called:
"the king of usability" (Internet
Magazine)
(https://www.nngroup.com/
people/jakob-nielsen/)
Good Example:
Ring/Silent switch
htt
ps:
//fa
culty.
was
hin
gton.e
du/jte
nenbg/
cours
es/
360/f0
4/s
ess
ions/
schneid
erm
anG
old
enR
ule
s.htm
l
1. Validity of system status
- Are users kept informed about what is going on?
- Is appropriate feedback provided within reasonable time about a user’s action?
2. Match between system and the real world
- Is the language used at the interface simple?
- Are the words, phrases and concepts used familiar to the user?
3. User control and freedom
- Are there ways of allowing users to easily escape from places they unexpectedly
find themselves in?
4. Consistency and standards
- Are the ways of performing similar actions consistent?
5. Help users recognize, diagnose, and recover from errors
- Are user messages helpful?
- Do they use plain language to describe the nature of the problem and suggest
a way of solving it?
6. Error prevention
- Is it easy to make errors?
- If so, where and why?
7. Recognition rather than recall
- Are objects, actions and options always visible?
8. Flexibility and efficiency of use
- Have accelerators (i.e. shortcuts) been provided that allow more experienced users to
carry out tasks more quickly?
9. Aesthetic and minimalist design
- Is any unnecessary and irrelevant information provided?
10. Help and documentation
- Is help information provided that can be easily searched and easily followed?
Ten U
sabili
ty H
euri
stic
s by
Jako
b N
iels
en
htt
p://w
ww
.use
it.c
om
/pap
ers
/heuri
stic
/heuri
stic
_lis
t.htm
l
htt
ps:
//tf
a.st
anfo
rd.e
du/d
ow
nlo
ad/T
enU
sabili
tyH
euri
stic
s.pdf
SEVERITY RATINGS
0 – this is not a usability problem
1 – cosmetic problem only
2 – minor usability problem
3 – major usability problem
4 – usability catastrophe; imperative to fix
Combination of frequency and impact
PROS AND CONS OF HE
• Pro
– Very cost effective
– Identifies many usability issues
• Contra
– relies on interpretation of guidelines
- tends to uncover many low-severity problems
– guidelines may be too generic
– needs more than one evaluator to be effective
- additional practical problem: multiple usability
experts more expensive and difficult to find as it
is to test 3-5 users
• Cognitive
walkthrough
involves simulating a
user’s problem-solving
process at each step in
the human- computer
dialog, checking to see
if the user’s goals and
memory for actions can
be assumed to lead to
the next correct action.
– Nielsen and
Mack, 1994
WALKTHROUGH BASICS
• Imagine how well a user could perform tasks
with your low-fidelity prototype
• Manipulate prototype as you go
– evaluate choice-points in the interface
– evaluate labels or options
– evaluate likely user navigation errors
• Revise prototype and perform again
WHY WE USE CW
• Cognitive walkthrough enables a designer to evaluate
an interface without users– a designer attempts to see the interface from the perspective of a user
• Low-investment technique to identify task-related
usability issues early on– no implementation or users required
– can be performed on existing interfaces
• Identify task-related problems before
implementation– invest a little now, save a lot later
• Enables rapid iteration early in design– can do several evaluations of trouble points
• Evaluations are only effective if your team– has the right skill set
– wants to improve the design, not defend it
WHY WE USE CW CONTINUED
Once one
• knows who the users are
• has a low-fidelity (paper) prototype of the
interface with enough “functionality” for
several tasks
• has task descriptions
• has scenarios designed to complete the
task
• has an evaluation team ready
WHEN TO DO/WHAT IS NEEDED
• Tell a story of why a user would perform it
• Critique the story by asking critical questions– Is the control for the desired action visible? Will a user see that the
control produces the desired effect? Will a user select a different
control instead? Will the action have the effect that the user intends?
Will a user understand the feedback and proceed correctly? What
happens if the user is wrong? Is there feedback to correct the error?
How would a user of <interface> react here?
Questions help see problems (focus, not a blindfold), but
time consuming → Streamlined CW technique with only
two questions
FOR EACH ACTION IN A TASK
• Easy to learn, to be performed early in the design process, or
be applied during any phase of dev.
- without first hand access to users
• Questions assumptions about what a user may be thinking
- unlike some usability inspection methods, takes explicit
account of the user's task
• Helps identify controls obvious to the designer but not a
user, difficulties with labels and prompts and inadequate
feedback
- quick and inexpensive to apply if done in a streamlined form
WALKTHROUGH PROS
30
• Diagnostic, not prescriptive
• Focuses mostly on novice users
• Designer's difficulty to put themselves in
user's mind
• Focus on task-related issues
• Interactions slower and not real
• Not providing quantitative results
Yet, a useful tool in conjunction with others
WALKTHROUGH CONS
• Compare and contrast Cognitive Walkthrough
with Heuristic Evaluation!
- Neither testing with real users,
nevertheless users’ point of view
- Uncover many usability problems
- Double expertise - both Human Computer
Interaction and the specific domain
- Multiple evaluators are best
- Inspections can be more thorough than
user testing
“Usa
bili
ty insp
ect
ion m
eth
ods
afte
r 15
year
s of re
sear
ch a
nd p
ract
ice”
(2007):
htt
ps:
//dig
ital
com
mons.
ute
p.e
du/c
gi/v
iew
con
tent.cg
i?re
fere
r=&
htt
psr
edir
=1&
articl
e=
100
9&
amp;c
onte
xt=
cs_pap
ers
DISCUSSION
• Software, compares interface with theguidelines
• Produce report and/or fix the code
• Manual check often needed: <alt> tag
(alternative for graphics) checked for content but not if
the text is appropriate – 'picture here'
• Number of fonts, average font size, deepestlevel of a menu, etc.
1371
AUTOMATED USABILITY TESTING
• Select representative users
• Select the setting
• Decide what tasks users should perform
• Decide what type of data to collect
• Before the test session (informed consent,
etc.)
• During the test session
• Debriefing after the session
Alma 2017-32
USER-BASED TESTING
WHEN TO TEST/STAGES
– Users perform tasks (early or later inthe development)
– Formative testing
• Low-level fidelity prototypes
• How the user perceives an interface component?
• Low-cost of paper prototypes, users comfortable to criticize
– Summative testing
• Evaluate the effectiveness of specific design choices
• High fidelity prototype
– Validation test
• Before release, compare to benchmarks
34
HOW MANY USERS?
– 5 users will find approximately 80% of problems
– 7 for small projects, 15 for medium-large projects
– Goal to find the major flaws , that will cause the
most problems and must be fixed
– How many users can we afford? How many users
can we get? How many users do we have time
for?
TASKS
• Clearly formulated; no explanation
• Tested
• One clear answer/way to solve
• Tasks that are performed often
+ critical tasks (logging in)
• No private, financial information
• Clear transition to next task
Oppgave x: Du er hjemme og du ser NRK Nett- TV pådin PC (http://www.nrk.no/nett-tv)
Se på episoden En uke i giftig avfall
av Reggie Yates og svar på følgende spørsmål:
• Hva heter den første av «brenneguttene» Reggie Yates treffer på Agbogbloshie?
• Hva viser seg å ikke være sant for britisk opprinnelse av salgsvarene Reggie og guttene finner?
EXAMPLE FOR CREATING TASKS
– Task performance (Usability Metrics for
Effectiveness)
Completion Rate, Number of Errors
– Time performance (Usability Metrics for
Efficiency)
Time-Based Efficiency, Overall Relative Efficiency
– User satisfaction (validated survey tool) (Usability
Metrics for Satisfaction)
Task Level Satisfaction, Test Level Satisfaction (Qualitative
data)
– Additional:
– Average time to recover from an error
– Time spent using help
– Number of visits to the search feature
Utilizing logging
– Time spent on specific web page
– Mouse movements, typing speed
Key logging → ch. 12 Automated data collection (yellow
ed.), Eye tracking → ch. 13 Measuring the human
MEASUREMENT
TESTING SESSION
• Preparation
• The session
• After the session
• Example –A Lifesaving Device to Prevent Wandering (Usability Test):
https://www.youtube.com/watch?v=Cri6rFjUgc4
• High-fidelity prototype testing of Tripadvisor.com
https://www.youtube.com/watch?v=W17oYOhWIWE
MAKING SENSE OF THE DATA
• Write up the results and help influence the
design of the specific interface
• Presentation to developers and managers
• Report should include
❑ All flaws
❑ Priorities
❑ For each flaw: description of the
problem, the data, priority, suggestions
for fix, estimated time/efforts for fix
• Report structure
❖ How you prepared & did UT
❖ What happened during the testing
❖ The implications and recomendations
QUESTIONS – PART I
Go to any interface (f.e. an app on
you phone)
• Difference between a summative and a
formative usability test of this interface
• Should we first test it with users or
with experts? Why?
• What are the differences between a
usability test and an experiment with
this app?
Interface usability test
• What is important to remind
participants before the usability test?
• What is the "think aloud" protocol?
Should it be used in summative or
formative studies?
• What are properties of good
tasks?
QUESTIONS – PART II
ASSIGNMENT 3, PART B
• a user-based usability test
• to evaluate a new interface
• allows people to track online their medical information,
such as blood tests, diagnostics, annual check-ups, and
patient visits
USABILITYVERSUS RESEARCH?
• UT similar to classic research approach
• But different goals
• Usability study – Research study - Research on
usability methods
• Historical roots of usability testing and experimental research and ethnography similar
• All part of usability testing: observation, key logging, click-stream analysis, quantitative measurement such as task and time performance, anonymity of participants
USABILITYVERSUS RESEARCH?
• Goals different
• Usability testing:
▪ An industry-based approach to improving interfaces
▪ Little concern for using only one approach or a theoretical point of view
▪ Building a successful product using the fewest resources, the fewest risks, in the shortest amount of time
USABILITYVERSUS RESEARCH?
USABILITYVERSUS RESEARCH?
• Practice different
• Usability testing:
▪ Practices, unacceptable in experimentaldesign: modifying the interface after every user(normal here)
▪ Not used to generalize (other interfaces,situations, or user groups)
▪ Often results kept confidential
A COMPARISON
Classical Research UsabilityTesting
Experimental design- isolate and
understand specific phenomena, with the
goal of generalization to other problems
Find and fix flaws in a specific
interface, no goal of generalization
Experimental design- a larger number of
participants is required
A small number of participants can be
utilized
Ethnography- observe to understand
the context of people, groups, and
organizations
Observe to understand where in the
interface users are having problems.
Ethnography- researcher participation is
encouraged
Researcher participation is not
encouraged in any way
General vs. specific
Nr. of paticipants
Focus on…
Researcher’s participation
Classical Research UsabilityTesting
Ethnography-longer-term research
methodShort-term research method
Ethnography or experimental design-
used to understand problems and/or
answer research questions
Used in systems and interface
development
Ethnography or experimental design- used
at earlier stages, often separate (or only
quasi-related) from the interface
development process
Typically takes place in later stages, after
interfaces (or prototype versions of
interfaces) have already been developed
Used for understanding problems Used for evaluating solutions
A COMPARISON
Time scale
Investigation vs.
development
Stage of R&D
Problems vs.
solutions
• Usability testing IS different from
traditional research
• also researchABOUT usability testing, with
research questions such as:
– Which methods are most effective?
– Which methods are most cost-effective?
– How many users are optimal?
RESEARCH ABOUT USABILITY TESTING