Nov 9, 2016
QDET2 short course | Miami, FL
Usability Testing for Survey Research: Usability Testing for Survey Research: Usability Testing for Survey Research: Usability Testing for Survey Research: How To and Best PracticesHow To and Best PracticesHow To and Best PracticesHow To and Best Practices
Jen Romano-Bergstrom
Sr. UX Researcher
@romanocog
Emily Geisen
Usability/Cog Lab Manager
Survey Methodologist
RTI
#QDET2
AgendaAgendaAgendaAgenda
Usability Testing and Survey Research
• What is usability and usability testing?
• Why do we need it in survey research?
• What to test and when
How to Conduct Usability Testing
• Planning
• Conducting sessions
• Analyzing results
2
2:00 – 3:45
4:00 – 5:30
#QDET2
Activity
• How long did it take you to get here?
• What is today’s date?
3
#QDET2
Why is Design Important?
4
#QDET2
Image source: Geisen & Romano Bergstrom, 2017
Is this position 0 or missing?
Why is Design Important?
Image source: Geisen & Romano Bergstrom, 2017
AgendaAgendaAgendaAgenda
Usability Testing and Survey Research
• What is usability and usability testing?
• Why do we need it in survey research?
• What to test and when
How to Conduct Usability Testing
• Planning
• Conducting sessions
• Analyzing results
6
#QDET2
2:00 – 3:45
4:00 – 5:30
Usability definedUsability definedUsability definedUsability defined
7
“The extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use.”
-ISO 9241:11
#QDET2
Usability definedUsability definedUsability definedUsability defined
8
“The extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use.”
-ISO 9241:11
#QDET2
9
• Product
• Users
• Goals
• Context
• Effectiveness, Efficiency, Satisfaction
What does usability mean for your survey?What does usability mean for your survey?What does usability mean for your survey?What does usability mean for your survey?
#QDET2
What does usability mean for surveys?What does usability mean for surveys?What does usability mean for surveys?What does usability mean for surveys?
10
Product Web sites, web surveys, paper surveys, apps
Users Our respondents (mostly), interviewers
Goals Respondents must be able to provide their correct
and accurate opinions, stories, facts, predictions
Context of use In their homes, offices, out and about
Effectiveness Completing the questions and survey with accurate
answers
Efficiency Completing the questions and survey quickly, with as
few steps/clicks as possible
Satisfaction Having a pleasant experience
#QDET2
11
Effectiveness: achieve a goalEfficiency: appropriate time /effort
Efficient + effective = useful
#QDET2
12
Satisfaction is more complex.
• Did it allow them to provide their accurate answers?
• Did they enjoy the experience?
• Did it require too much time to complete?
• Did they find the instrument easy to use?
• Did they find it easy to learn how to use?
#QDET2
13
Other factors may be important.
• Easy to remember how to use (memorability)
• Error frequency and severity
• Accessibility
• And most crucial for surveys
• Data quality
• Respondent burden
#QDET2
14
Usability testing is watching a user try to achieve the goal
• Participants represent real users
• Participants do real tasks
• You observe and record what participants do
• You think about what you saw:
• Analyze data,
• Diagnose problems,
• Recommend changes.
• Make changes and test again
#QDET2
Image source: Geisen & Romano Bergstrom, 2017
AgendaAgendaAgendaAgenda
Usability Testing and Survey Research
• What is usability and usability testing?
• Why do we need it in survey research?
• What to test and when
How to Conduct Usability Testing
• Planning
• Conducting sessions
• Analyzing results
15
#QDET2
2:00 – 3:45
4:00 – 5:30
16
Surveys are not web sites.
…so why is usability testing
needed for survey research?
#QDET2
Surveys
• Improve data quality
• Reduce respondent burden
Websites
• Increase traffic/revenue (e.g., sell more product)
• Disseminate information
17
Usability Testing Goals
#QDET2
18
Users are not trained interviewers
• Web surveys go to the general (or specific) public
• Varying levels of computer expertise
• Varying levels of literacy
• Likely to be in a hurry, interrupted, distracted
• There is no interviewer
• No one to interpret the questions
• No one to navigate around the instrument
#QDET2
19
#QDET2
Presser et al 2004: pretesting focuses on a “broader concern for improving data quality so that measurementsmeet a survey’s objective”
Image source: Geisen & Romano Bergstrom, 2017
20
Cognitive Testing vs Usability
• Cognitive Testing: Do people understand it?
• What are your feelings towards Obamacare? vs.
• What are your feelings towards the Affordable Care Act?
• Usability Testing: Can people use it?
#QDET2
Image source: Geisen & Romano Bergstrom, 2017
Usability Model for Surveys
#QDET2
1. Interpreting the design: a. What meaning do respondents assign to visual design and layout?
b. How do respondents believe the survey works?
2. Completing actions and navigating: a. How well does the survey support respondents’ ability to
complete tasks and goals?
b. How well do respondents follow navigational cues and
instructions?
3. Processing feedback:a. How do respondents interpret and react to the survey feedback in
response to their actions?
b. How well does the survey help respondents identify, interpret, and
resolve errors?
Source: Geisen & Romano Bergstrom, 2017
22
Example Usability Study
Romano & Chen, 2011
#QDET2
23
Navigation Usability StudyMethod
• Lab-based usability study
• TA read introduction and left letter on desk
• Separate rooms
• R read letter and logged in to survey
• Think Aloud
• Eye Tracking
• Satisfaction Questionnaire
• Debriefing
* p < 0.0001
#QDET2
Romano & Chen, 2011
24
Navigation Usability StudyEye Tracking
Romano & Chen, 2011
• Participants looked at Previous and Next in PN conditions
• Many participants looked at Previous in the N_P conditions
• Couper et al. (2011): Previous gets used more when it is on the right.
#QDET2
25
Navigation Usability StudyDebriefing Interview
• N_P version
• Counterintuitive
• Don’t like the “buttons being flipped.”
• Next on the left is “really irritating.”
• Order is “opposite of what most people would design.”
• PN version
• “Pretty standard, like what you typically see.”
• The location is “logical.”
#QDET2
Romano & Chen, 2011
• “If you’re doing a web survey, you’re doing a mobile survey.” - Michael Link, 2013 AAPOR
• Respondents on mobile devices are as high as 30% or more for some surveys (Lugtig, Toepoel & Amin, 2016; Saunders, 2015).
26
Don’t forget about mobile
#QDET2
27
#QDET2
Romano Bergstrom, QDET2, 2016
Mobile Usability Study
V1: long list of items: grid on desktop; drop down to select
response on mobile
28
#QDET2
Mobile Usability Study
V1: long list of items; drop down to select response
V2: each question on
separate screensRomano Bergstrom, QDET2, 2016
Usability Testing Demo
29
#QDET2
AgendaAgendaAgendaAgenda
Usability Testing and Survey Research
• What is usability and usability testing?
• Why do we need it in survey research?
• What to test and when
How to Conduct Usability Testing
• Planning
• Conducting sessions
• Analyzing results
30
2:00 – 3:45
4:00 – 5:30
#QDET2
What can be tested?
31
#QDET2
Image source: Geisen & Romano Bergstrom, 2017
Exploratory Testing Example
32
• Background
• GSS collects data about graduate students and postdocs in different fields of study
• NSF wanted to modify GSS to capture data at a more consistent and detailed level (e.g., programs instead of departments)
• The design approach
• Redesigned parts of survey using this model
• Conducted usability testing
#QDET2
Created a Hierarchy to Collect Data
33
• School/College: The Graduate school• Department: Biological Sciences
• Program: Cell Biology
• Provide counts of graduate students in cell biology by race/ethnicity, sex, etc
• Program: Botany
• Provide counts of graduate students in botany by race/ethnicity, sex, etc
• Department: Physics
• Program: Atmospheric physics
• Provide counts of graduate students in cell biology by race/ethnicity, sex, etc
#QDET2
Survey Framework Did Not Fit Users
34
• No common terminology
• What is a department vs program?
• Used other terminology altogether: division, concentration, track, field, subject
• No common hierarchy or structure
• Departments within programs and vice versa
• No departments, just programs
• Some departments had programs, some didn’t
• Information not available at level desired
#QDET2
Assessment & Verification:
35
#QDET2
• Low-fidelity prototypes
• Paper or computer mockups computer
• Wireframe
• High-fidelity prototypes
• Early interactive prototype / selected
interactive questions
• Finished product
Low-Fidelity Testing
36
• Methods: simple drawings or illustrations, Word/Excel/Visio, simple screen shots, website shell
• Uses: new questions or surveys, redesigns, evaluating information architecture, visual aspect of survey, web-centric features
• Benefits: allows for quick-feedback without spending too much time or money on programming, can apply results to other aspects of survey
• Don’t forget mobile testing (if using) at this stage!
#QDET2
What I can test: Paper Mock-Ups
37
#QDET2
Image source: Geisen & Romano Bergstrom, 2017
38
#QDET2
What can I test: Mobile Paper Prototype
Image source: Craig, 2016; Geisen & Romano Bergstrom, 2017
What can I test: Computer mockup
39
#QDET2
Image source: Geisen & Romano Bergstrom, 2017
What can I test: Low-Fidelity Wireframe
40
#QDET2
Romano Bergstrom et al., 2011
41
#QDET2
What can I test: High-Fidelity Wireframe
Romano Bergstrom et al., 2011
Design-based and iterative
42
#QDET2
Romano Bergstrom & Strohl, 2014
When to test
43
• Start as early as possible
• Testing should be integrated into the programming schedule, not conducted after
• Test in stages as web survey is being developed
• Test until all serious problems resolved / stop learning anything new (ideally)
• Iterative testing benefits from more rounds, fewer people
#QDET2
Reasons for more rounds, fewer people
44
• Identify more issues: 2 rounds of 5 users will likely identify more issues than 1 round of 10 users
• Diminishing returns from more users in each round
• Can be hard for users to see past the big glaring problems to other more subtle problems
• Allows you to test solution
• Good balance between testing resources and revision resources
• Quicker to summarize results and revise testing
#QDET2
Smaller rounds support collaboration
45
• Include stakeholders in testing
• Have programmers, clients, decision-makers observe testing live or remotely
• Direct observation is more exciting than reading a report
• Collaborative process
• Conduct tests in the morning, meet to discuss over a long lunch, recommendations for changes ready in the afternoon
• Report can then summarize findings and changes instead of findings and recommendations
#QDET2
Iterative Testing: Example
One box, prompt inside box
One box, prompt below box: resulted in
more complete names
Separate boxes, prompt below: even
more complete names
46
#QDET2
Geisen, Olmsted, Goerman & Lakhe, 2014
Iterative Testing: Example
One box, prompt inside box
One box, prompt below box: resulted in
more complete names
Separate boxes, prompt below: even
more complete names
47
#QDET2
Geisen, Olmsted, Goerman & Lakhe, 2014
Start with web & survey best practices
48
• User-centered evaluation includes best practices/findings from literature
• Abundance of literature
• Designing Effective Web Surveys (Couper)
• Internet, Mail, and Mixed-Mode Surveys (Dillman et al.)
• Jakob Nielsen
#QDET2
Build off the literature before doing usability testing
49
• Usability testing will show you how well or how easily people can do method A
• Will not necessarily show you that method A is definitively better than method B
• Not a replacement for large, probability-based methodological experiments
• Don’t reinvent the wheel
#QDET2
Building off the literature: Example
50
• Concern: Want to know best method for providing definitions in web surveys
• Ask: Has this been done before?
• Start with the literature:
• Conrad, Couper, Tourangeau, Peytchev (2006)
• Peytchev, Conrad, Couper, Tourangeau (2007)
• Peytchev, Conrad, Couper, Tourangeau (2010)
#QDET2
Building off the literature: Example (cont)
51
• Methods
• Experiment 1: one-click, two-clicks, click and scroll
• Experiment 2: roll-over, one-click, two-clicks
• Experiment 3: roll-over vs. always included
• Conclusions:
• Reading definitions probably improves accuracy
• Less effort required, more likely to read definitions
#QDET2
Example 2
52
• Methods
• Experiment 1: One-click, two-clicks, click and scroll
• Experiment 2: roll-over, one-click, two-clicks
• Experiment 3: hover-over vs always included
• Conclusions:
• Reading definitions probably improves accuracy
• Less effort required, more likely to read definitions
#QDET2
Start with the literature, but decide what’s relevant for your study
53
• The literature may not focus on the study population needed for your survey
• May not be any literature on the particular topic or issue your survey has
• And sometimes the experts just don’t agree,then what?
#QDET2
AgendaAgendaAgendaAgenda
Usability Testing and Survey Research
• What is usability and usability testing?
• Why do we need it in survey research?
• What to test and when
How to Conduct Usability Testing
• Planning
• Conducting sessions
• Reporting findings
54
#QDET2
2:00 – 3:45
4:00 – 5:30
Obstacles to Testing
55
• “There is no time.”
• Start early in development process.
• One morning a month with 3 users (Krug)
• 12 people in 3 days (Anderson Riemer)
• 12 people in 2 days (Lebson & Romano Bergstrom)
• “I can’t find representative users.”
• Everyone is important, and something is better than nothing.
• Remote testing or travel
• “We don’t have a lab.”
• You can test anywhere.
#QDET2
Planning
56
• Participant Selection and Recruitment
• Testing Location and Equipment
• Identifying Testing Focus/Concerns
• Identifying Measures to Collect
• Decide on Testing Roles
• Preparing Test Materials
#QDET2
Participants: determining target audience
57
• Recruit people who are like your target users
• Who is the survey for?
• Consider participants’ jobs and other roles
• Recruit diverse participants
• Age
• Income
• Education
• Is location important?
#QDET2
Participants: How many?
Adapted from: Nielsen and Landauer (1993)
#QDET2
Participant recruitment
59
• Existing Participant Lists (Existing Frame)
• Target respondents; small percentage of successful recruits
• No Participant Lists (Constructed Frame)
• Research firm with database
• Reliable; may be professional participants
• Hang fliers nearby
• Target locals; bit of work walking around
• Online social media ads
• Target specific criteria; social media users
• Classifieds
• Lots of responses quickly, non-Internet users; may be professional
• Snowball (word-of-mouth)
• Good for specific populations; They may know each other
#QDET2
Participant recruiting tips
60
• Recruit “floaters” (for no-shows and cancellations)
• Talk to your participants early
• Ask about specific behaviors relevant to your study (e.g. mobile usage, time spent online)
• Talk about what they’ll do and build rapport
• Get an email address and a contact number
• Schedule sessions ASAP (e.g., 3 weeks ahead)
• Remind them the day before
#QDET2
61
Location: Lab, Remote, In the Field
• Controlled environment
• All participants have the
same experience
• Record and communicate
from control room
• Observers watch from
control room and provide
additional probes (via
moderator) in real time
• Incorporate physiological
measures (e.g., eye
tracking, EDA)
• No travel costs
Laboratory Remote In the Field
• Participants tend to be
more comfortable in
their natural
environments
• Recruit hard-to-reach
populations (e.g.,
children, doctors)
• Moderator travels to
various locations
• Bring equipment (e.g.,
eye tracker)
• Natural observations
• Participants in their
natural environments
(e.g., home, work)
• Use video chat
(moderated sessions)
or online programs
(unmoderated)
• Conduct many
sessions quickly
• Recruit participants in
many locations (e.g.,
states, countries)
#QDET2
62
Lab-Based Usability Testing
#QDET2
Image source: Geisen & Romano Bergstrom, 2017
63RTI Observation Room
#QDET2
64
Remote moderated Usability Testing
#QDET2
Image source: Geisen & Romano Bergstrom, 2017
65
#AAPOR2016
Great for
quickly
assessing
different
designs –
gather large
amounts of
data quickly
0.00
0.10
0.20
0.30
0.40
0.50
0.60
Only Me Friends Confirm Change progress bar back on
phone
outside area
V1, N=2000
V2, N=1898
Pre-UX, N=864
First-click heat maps
Percentage of participants who made
the first click to these areas of interestRemote unmoderated testingexample
66
#QDET2
In-the-field usability testing
Image source: Geisen & Romano Bergstrom, 2017
Testing location: summary
67
Your location / Facility User’s location
Pros • Use your equipment (e.g.,
one-way mirror, recording
software)
• Controlled setting with
no/few interruptions
• Simulates real user experience
• Users have access to info
• Easier to schedule/
accommodate users
Cons • Not true to real life
• More burden to user – more
no shows and cancelations
• Using your computer/device
not the user’s computer
• Need portable equipment or do
without
• Interviewer travel - increased
cost to researcher
• Safety matters
• Harder to schedule observers
#QDET2
Equipment: video/audio recording
68
• Helps with note-taking
• Reduces need to take notes or have a note-taker during interview
• Can take more nuanced notes afterwards if you can start and stop the video
• Helps after the interview
• Useful during debriefing to replay parts of video
• Accommodates observers who could not make it to the actual session
#QDET2
Equipment: Screen-sharing for observers
69
• Fosters collaboration
• Can accommodate observers from any location
• Facilitate discussions in conference setting
• Improved schedule
• Stakeholders get information immediately
• No waiting for recorded videos or report
• Cheaper
• Inexpensive compared to travel costs
• However, watching from their desks leads to less engagement than in-person – the best is always having observers in-person, in the observation room
#QDET2
Identify Testing Focus and Concerns
70
• General focus: Can they complete the survey?
• More specific concern: We are worried about the definitions.
• More specific: Are hover-overs an effective way of providing definitions?
• More specific: Do participants know definitions are available?
• More specific: Do participants understand what’s hover-overable?
• More specific: How helpful/unhelpful are the definitions?
#QDET2
More examples of specific concerns
71
• How well do people understand the instructions?
• Do people read the entire question and response options before responding? If not, what do they read?
• Can people use the Next and Previous navigation buttons correctly?
• Do people know what to do on each screen?
• How easily do people find the information they need to answer the questions?
• When people do not understand something or have a question, do they use the FAQs?
• Are the FAQs helpful/sufficient? What is missing?
• Are people able to correctly select their job from a long list of potential jobs?
• When do people use the left navigation, if at all?
• Can people use sliders correctly to select the desired response?
#QDET2
Identifying measures to collect
72
• Observational metrics tell us howhowhowhow participants navigate and interact.
• Self-report metrics tell us whywhywhywhy participants focus on certain site aspects.
• Eye tracking tells us what, how long, and how oftenwhat, how long, and how oftenwhat, how long, and how oftenwhat, how long, and how often participants focus on design elements.
• The combination of observational, self-report, and implicit data allows us to accurately measure
the user experience. We do not use eye tracking in isolation.
#QDET2
Include eye tracking?
73
Consider using eye tracking if you want to:Consider using eye tracking if you want to:Consider using eye tracking if you want to:Consider using eye tracking if you want to:
• Observe what attracts attention
• Discover potential areas of confusion/interest
• Watch as users learn to interact with an interface over time
• Validate/invalidate design changes
Usability Testing Usability Testing with Eye Tracking
Can users complete the survey
and individual items?
Do users see things that aid/hurt
completion?
• Direct observation of users’
behaviors
• Analysis: users’ conceptual
model vs. survey model
• Look patterns: locations, duration,
path
• Analysis: intended visual hierarchy
vs. actual look pattern
Evaluates usability Evaluates user experience
Supports improved ease of use Supports improved ease of use and
increased engagement
#QDET2
74
#QDET2
Eye tracking enables researchers to assess attention to motivational language and brand, which may impact response rate....
75Walton, Romano Bergstrom, Hawkins & Pierce, 2014
#QDET2
Eye tracking example: People read pages People read pages People read pages People read pages with with with with questions on them differently than other pagesquestions on them differently than other pagesquestions on them differently than other pagesquestions on them differently than other pages....
76Jarrett & Romano Bergstrom, 2014
The F-shaped eye-tracking pattern of the
block of text at the top of the page is
completely different from the eye-
tracking pattern on the question and
answer spaces at the bottom of the page.
#QDET2
Eye tracking example: People People People People dondondondon’’’’t read important t read important t read important t read important parts of survey invitation letters.parts of survey invitation letters.parts of survey invitation letters.parts of survey invitation letters.
77Olmsted-Hawala, Wang, Willimack, Burke & Lakhe, 2016
#QDET2
78
#AAPOR2016
When NOT to track
Jarrett & Romano Bergstrom, 2014
79
Slot-In Survey AnswersSlot-In Survey Answers
#QDET2
Jarrett & Romano Bergstrom, 2014
80
Gathered Survey
Answers
Gathered Survey
Answers
#QDET2
Jarrett & Romano Bergstrom, 2014
81
Created Survey
Answers
Created Survey
Answers
#QDET2
Jarrett & Romano Bergstrom, 2014
82
Third-Party Survey
Answers
Third-Party Survey
Answers
#QDET2
Jarrett & Romano Bergstrom, 2014
83
Include eye tracking? – Summary
#QDET2
Jarrett & Romano Bergstrom, 2014
Plan your measurements
84
• Examples of performance measures
• Success rate and/or speed for tasks
• Requests for help/assistance
• Number and types of errors that occurred (e.g., incorrect selections, menu choices)
• Count of features used (e.g., help menu, hover-over definitions, calculate button)
• Examples of preference measures
• Do you prefer A or B? Why?
• How or easy or difficult was it to do … Very easy, easy…
#QDET2
Organize roles
85
• Meet and greet
• Observers
• Test facilitator
• Note taker
• Videographer
#QDET2
Develop your test materials
86
• Develop consent forms, screeners
• Instructions/directions for participants
• Prepare written tasks/scenarios (on index cards)
• Pretest/posttest questionnaires
• Observer note sheets
#QDET2
Tasks, Scenarios, Probes
87
• Scenario – a real-life situation that you ask participants to put themselves in to test the instrument
• Task – something you want the participant to accomplish
• Probe – questions asked of the user to elicit additional information and feedback
#QDET2
A scenario brings the data together into a coherent story
88
• Keep scenarios short and simple
• Scenarios should reflect things participants might actually do
• Use vignettes to test rare/unusual situations
• Use the participant’s words, not researchers
• May need to prepare fake data to answer questions (e.g., SSN, phone number)
#QDET2
For some products, you need a task
89
• These are things that you want the user to do
• Often as simple as: “Please fill out this survey as you would at home”
• You may need specific tasks to match your test focus and concerns
#QDET2
90
Example 1 – Scenario
Romano Bergstrom, Childs, Olmsted-Hawala & Jurgenson, 2013
• Participants imagined they were at the respondent’s door
#QDET2
91
Example 2 – Scenario
• Participants imagined they were at the respondent’s door
#QDET2
Romano Bergstrom, Childs, Olmsted-Hawala & Jurgenson, 2013
92
Example 2 – Scenario
• To assess if the Information Sheet worked well, scripts were used to ensure interviewers could record difficult-to-record households.
#QDET2
Romano Bergstrom, Childs, Olmsted-Hawala & Jurgenson, 2013
Example 2 - Scenario and Task
Scenario: Your graduate school will include the following PhD programs this year:
• Biology
• Chemistry
• Marine, Earth, and Atmospheric Sciences
• Physics
• Spanish
Task: Please update the list of departments, programs, and research units that should be included in the survey for this year.
#QDET2
Don’t confuse participants with too many scenarios
95
• Whittle lists of tasks/scenarios to manageable number
• Prepare tasks to give to participants (index cards are useful)
• Tasks should flow in order of the survey
• Okay to change tasks between rounds
#QDET2
AgendaAgendaAgendaAgenda
Usability Testing and Survey Research
• What is usability and usability testing?
• Why do we need it in survey research?
• What to test and when
How to Conduct Usability Testing
• Planning
• Conducting sessions
• Analyzing results
96
#QDET2
2:00 – 3:45
4:00 – 5:30
Examples (Videos)
97
#QDET2
The day before the test
98
• Send out reminders
• Phone or email to respondents
• Email to stakeholders
• Equipment/Facility
• Check the computers and software (remote sharing, video recording), keyboard, mouse
• Make sure the room you’ll use is tidy
• Make sure your meet/greet person has the final list of participants’ names
• Incentives are available
#QDET2
The day before the test
99
#QDET2
Set-Up for Mobile
100
#QDET2
Image source: Geisen & Romano Bergstrom, 2017
Set-Up for Mobile w Eye Tracking
101
Fors Marsh Group UX Lab Facebook UX Lab
#QDET2
Moderating Technique: Think Aloud
102
• Getting respondents to verbalize their thoughts
• Can be concurrent or retrospective
• Implementing
• Explain “thinking aloud” at the start
• Get the participant to try an example
• Remind them periodically (What are you thinking?)
• Snags:
• Thinking aloud is not natural for some people
• Others will start well, then forget
#QDET2
Think Aloud: concurrent vs retrospective
Concurrent
• Immediate thoughts (good recall)
• Procedural comments
• May affect task performance and usability metrics.
• Can interfere with eye-tracking data
• Shorter session length
• Less natural
Retrospective
• Relies on memory (recall failure)
• Explanatory comments
• No effect on task performance or usability metrics
• Accurate Eye-tracking data
• Session length increases
• More natural
103
#QDET2
Users may need help with thinking aloud
104
• Prompt as needed
• “What are you thinking?”
• “Tell me what you’re doing.
• “Tell me what you’re looking at.”
• “Keep talking.”
• “Tell me more about that.”
• Show you’re listening
• Be Patient
• Give reminders
#QDET2
Moderating Technique: Verbal Probing
105
• Ask targeted questions (probes) about content or functionality
• Explore content in more depth
• Concurrent vs retrospective
• Scripted vs spontaneous
#QDET2
Verbal Probing: Concurrent vs Retrospective
Concurrent
• Immediate thoughts (good recall) and more detail
• May be biased
• Affects task performance and usability metrics
• Ideal for exploratory tests and cognitive/usability combined tests
• Better for participants with low cognitive ability
Retrospective
• Relies on memory (recall failure), less detail
• Less biased
• No effect on task performance or usability metrics
• Can be used in any stage of testing
106
#QDET2
Verbal Probe Examples 1
Immediate thoughts or reactions
• What are your thoughts on this [screen]?
• What are you thinking?
• What are you doing?
• What are you looking at?
• What are you trying to do?
Does functionality match expectations?
• What do you expect to happen when you [click that link/button]?
• How did you expect that to work?
107
#QDET2
Verbal Probe Examples 2
Understand user
• What do you want to
accomplish?
• Can you describe the steps
you are taking now?
• How did you feel about
that process to [complete
task]?
• What’s going through your
mind right now?
Probing further
• Echoing
• Can you tell me more about that?
• Can you provide an example of [X]?
108
#QDET2
Probing Tips
109
• Avoid yes/no questions, people tend to be acquiescent
• Bad: “Was this task difficult to complete?”
• Good: “How easy or difficult was that task to complete?”
• Ask unbiased questions
• Bad: “Are you looking at the X link?”
• Good: “Can you tell me what are you looking at?”
• Be quiet and wait
• Bad: Impatiently asking “what’s happening?”
• Good: Count to 20 before jumping in. Or to 30.
#QDET2
When you hear yourself asking a leading question, balance it
110
Leading
question
“So you think
that’s difficult
then?”
Balanced
question
“...or was it
easy?”
#QDET2
Choosing a Moderating Technique
111
• Can the participant work completely alone?
• Will you need time on task and accuracy data?
• Are the tasks multi layered and/or require concentration?
• Will you be conducting eye tracking?
#QDET2
Moderating Techniques: Summary
112
#QDET2
Approach Advantages DisadvantagesConcurrent
Think
Aloud
• Feedback in real-time
• Good recall
• Procedural comments
• Shorter session length
• Unbiased feedback
• Easy for moderators to learn
• Slight effect on task
performance (vs. RTA)
• May affect usability
metrics.
• Some interference
with eye-tracking data
• Less natural
• Hard for some
participantsRetro-
spective
Think
Aloud
• Explanatory comments
• No effect on task performance
or usability metrics
• Accurate Eye-tracking data
• More natural
• Unbiased feedback
• Easy for moderators to learn
• Recall failure
• Longer session length
• Hard for some
participants
• Requires heavy cueing
Geisen & Romano Bergstrom, 2017
Moderating Techniques: Summary 2
113
#QDET2
Approach Advantages Disadvantages
Concurrent
Verbal
Probing
• Feedback in real-time
• Good recall
• Ask targeted questions
• More detailed comments
• Works well for exploratory tests,
cognitive/usability combined tests
• Easiest for participants, especially
with low cognitive ability
• May introduce bias
• Negative effect on task
performance and
usability metrics
• Hardest for moderators
to learn
• Longest session lengths
Retro-
spective
Verbal
Probing
• Less biased
• Ask targeted question
• No effect on task performance or
usability metrics
• Can be used in any stage of testing
• Easier for participants
• Recall failure
• Requires some cueing
• Less detailed
comments
• Hard for moderators to
learn (vs CTA, RTA)
Geisen & Romano Bergstrom, 2017
Moderating Tips
114
• Maintain objective viewpoint
• Be prepared for surprises
• Report accurately
• Redirect participants to keep them on task
• Avoid coaching
• Don’t help participants
• Don’t ask if they would do “anything else”
• Don’t suggest: “Let’s try this”
#QDET2
Moderating Tips (Continued)
• Participant is silent
• Be patient. Then if necessary, ask “What are you thinking?”
• Participant asks you, “Is this right?”
• “We just want to see how you do it?”
• Participant asks you for help
• “What would you do if I wasn’t here?”
• Participant blames himself/herself
• “A lot of people have had this problem.”
• “Your feedback helps us learn what we need to improve.”
115
#QDET2
Provide neutral feedback
116
• Provide praise/feedback for every task, successful or unsuccessful
• Keep it neutral
• “mm hmmm:
• “uh huh”
• “That’s interesting”
• “that’s helpful”
• If you are writing notes then write down everything
• If you only write bad things, the participant will notice it’s biased
#QDET2
AgendaAgendaAgendaAgenda
Usability Testing and Survey Research
• What is usability and usability testing?
• Why do we need it in survey research?
• What to test and when
How to Conduct Usability Testing
• Planning
• Conducting sessions
• Analyzing Results
117
#QDET2
2:00 – 3:45
4:00 – 5:30
Analyzing Results
118
Analyze
• Collect all of your data together
• Summarize/Reduce to meaningful chunks
• Understanding what it means
Revise
• What can/should we do about it?
Test again
Barnum, 2011
#QDET2
Collect All of Your Data Together
119
• Self-reported
• Verbalizations
• Satisfaction and difficulty ratings from questionnaires
• Observational
• Usability metrics
• Click patterns
• Behavior and other observations
• Implicit:
• Eye-tracking data
#QDET2
120
“Focus ruthlessly on only the most serious problems”
Krug, 2010
#QDET2
Focus on the most serious problems
121
• Run your usability test
• Have a meeting with the key stakeholders
• Decide on most important problems (and problems that are easily fixed)
• Go away and fix them
• Ignore everything else
• Test again and repeat
#QDET2
Determining the problems to target
122
• Frequency of the problem (e.g., 5 out of 5 users)
• How likely are others to have this problem?
• What’s the impact on the survey (e.g., causes break-offs, inaccurate data)
• How much of the survey does it affect (e.g., local vs. global finding)?
• How easy/difficult is it to fix (low-hanging fruit)
#QDET2
Focus on findings that improve quality
123
• When usability testing a survey, focus on
• Improving data quality
• Reducing respondent burden
#QDET2
Determining what and how to fix problems is harder.
124
• Group debrief after the test to discuss
• Set priorities
• Most serious problems (ignore “nice to haves”)
• Problems that are easy to fix (e.g., typos, wording)
• How long will it take to fix?
• Will fixing it cause other potential problems?
• Recommendations should be specific/doable within timeframe and budget
• Not everything will get fixed
#QDET2
Weigh effect on data vs Effort to fix
125
#QDET2
Geisen & Romano Bergstrom, 2017
With few users, each one really matters
126
• Are there outliers in small studies?
• How representative is each user?
• Are others likely to have this problem?
• Caveats for reporting data with few users
• Report numbers (4 out of 5) rather than percentages
• Report with numbers rather than words (most, usually, almost all)
• With a small number of users, you will find your biggest problems
• Iterate and/or conduct remote unmoderated testing, if possible
#QDET2
Q&A
127
Exa
mp
le C
on
sen
t Fo
rm
128
#AAPOR2016
129
#AAPOR2016
Exa
mp
le M
od
era
tor’
s G
uid
e
Exa
mp
le S
ati
sfa
ctio
n Q
ue
stio
nn
air
e
130
Exa
mp
le D
eb
rie
fin
g I
nte
rvie
w
131