Download pdf - Usability Testing for Survey Research:How to and Best Practices

Nov 9, 2016

QDET2 short course | Miami, FL

Usability Testing for Survey Research: Usability Testing for Survey Research: Usability Testing for Survey Research: Usability Testing for Survey Research: How To and Best PracticesHow To and Best PracticesHow To and Best PracticesHow To and Best Practices

Jen Romano-Bergstrom

Sr. UX Researcher

Facebook

[email protected]

@romanocog

Emily Geisen

Usability/Cog Lab Manager

Survey Methodologist

RTI

[email protected]

#QDET2

AgendaAgendaAgendaAgenda

Usability Testing and Survey Research

• What is usability and usability testing?

• Why do we need it in survey research?

• What to test and when

How to Conduct Usability Testing

• Planning

• Conducting sessions

• Analyzing results

2

2:00 – 3:45

4:00 – 5:30

#QDET2

Activity

• How long did it take you to get here?

• What is today’s date?

3

#QDET2

Why is Design Important?

4

#QDET2

Image source: Geisen & Romano Bergstrom, 2017

Is this position 0 or missing?

Why is Design Important?








• Planning



6

#QDET2

2:00 – 3:45

4:00 – 5:30

Usability definedUsability definedUsability definedUsability defined

7

“The extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use.”

-ISO 9241:11

#QDET2

Usability definedUsability definedUsability definedUsability defined

8

“The extent to which a product can be used by specified users to achieve specified goals with effectiveness, efficiency, and satisfaction in a specified context of use.”

-ISO 9241:11

#QDET2

9

• Product

• Users

• Goals

• Context

• Effectiveness, Efficiency, Satisfaction

What does usability mean for your survey?What does usability mean for your survey?What does usability mean for your survey?What does usability mean for your survey?

#QDET2

What does usability mean for surveys?What does usability mean for surveys?What does usability mean for surveys?What does usability mean for surveys?

10

Product Web sites, web surveys, paper surveys, apps

Users Our respondents (mostly), interviewers

Goals Respondents must be able to provide their correct

and accurate opinions, stories, facts, predictions

Context of use In their homes, offices, out and about

Effectiveness Completing the questions and survey with accurate

answers

Efficiency Completing the questions and survey quickly, with as

few steps/clicks as possible

Satisfaction Having a pleasant experience

#QDET2

11

Effectiveness: achieve a goalEfficiency: appropriate time /effort

Efficient + effective = useful

#QDET2

12

Satisfaction is more complex.

• Did it allow them to provide their accurate answers?

• Did they enjoy the experience?

• Did it require too much time to complete?

• Did they find the instrument easy to use?

• Did they find it easy to learn how to use?

#QDET2

13

Other factors may be important.

• Easy to remember how to use (memorability)

• Error frequency and severity

• Accessibility

• And most crucial for surveys

• Data quality

• Respondent burden

#QDET2

14

Usability testing is watching a user try to achieve the goal

• Participants represent real users

• Participants do real tasks

• You observe and record what participants do

• You think about what you saw:

• Analyze data,

• Diagnose problems,

• Recommend changes.

• Make changes and test again

#QDET2








• Planning



15

#QDET2

2:00 – 3:45

4:00 – 5:30

16

Surveys are not web sites.

…so why is usability testing

needed for survey research?

#QDET2

Surveys

• Improve data quality

• Reduce respondent burden

Websites

• Increase traffic/revenue (e.g., sell more product)

• Disseminate information

17

Usability Testing Goals

#QDET2

18

Users are not trained interviewers

• Web surveys go to the general (or specific) public

• Varying levels of computer expertise

• Varying levels of literacy

• Likely to be in a hurry, interrupted, distracted

• There is no interviewer

• No one to interpret the questions

• No one to navigate around the instrument

#QDET2

19

#QDET2

Presser et al 2004: pretesting focuses on a “broader concern for improving data quality so that measurementsmeet a survey’s objective”


20

Cognitive Testing vs Usability

• Cognitive Testing: Do people understand it?

• What are your feelings towards Obamacare? vs.

• What are your feelings towards the Affordable Care Act?

• Usability Testing: Can people use it?

#QDET2


Usability Model for Surveys

#QDET2

1. Interpreting the design: a. What meaning do respondents assign to visual design and layout?

b. How do respondents believe the survey works?

2. Completing actions and navigating: a. How well does the survey support respondents’ ability to

complete tasks and goals?

b. How well do respondents follow navigational cues and

instructions?

3. Processing feedback:a. How do respondents interpret and react to the survey feedback in

response to their actions?

b. How well does the survey help respondents identify, interpret, and

resolve errors?

Source: Geisen & Romano Bergstrom, 2017

22

Example Usability Study

Romano & Chen, 2011

#QDET2

23

Navigation Usability StudyMethod

• Lab-based usability study

• TA read introduction and left letter on desk

• Separate rooms

• R read letter and logged in to survey

• Think Aloud

• Eye Tracking

• Satisfaction Questionnaire

• Debriefing

* p < 0.0001

#QDET2

Romano & Chen, 2011

24

Navigation Usability StudyEye Tracking

Romano & Chen, 2011

• Participants looked at Previous and Next in PN conditions

• Many participants looked at Previous in the N_P conditions

• Couper et al. (2011): Previous gets used more when it is on the right.

#QDET2

25

Navigation Usability StudyDebriefing Interview

• N_P version

• Counterintuitive

• Don’t like the “buttons being flipped.”

• Next on the left is “really irritating.”

• Order is “opposite of what most people would design.”

• PN version

• “Pretty standard, like what you typically see.”

• The location is “logical.”

#QDET2

Romano & Chen, 2011

• “If you’re doing a web survey, you’re doing a mobile survey.” - Michael Link, 2013 AAPOR

• Respondents on mobile devices are as high as 30% or more for some surveys (Lugtig, Toepoel & Amin, 2016; Saunders, 2015).

26

Don’t forget about mobile

#QDET2

27

#QDET2

Romano Bergstrom, QDET2, 2016

Mobile Usability Study

V1: long list of items: grid on desktop; drop down to select

response on mobile

28

#QDET2

Mobile Usability Study

V1: long list of items; drop down to select response

V2: each question on

separate screensRomano Bergstrom, QDET2, 2016

Usability Testing Demo

29

#QDET2







• Planning



30

2:00 – 3:45

4:00 – 5:30

#QDET2

What can be tested?

31

#QDET2


Exploratory Testing Example

32

• Background

• GSS collects data about graduate students and postdocs in different fields of study

• NSF wanted to modify GSS to capture data at a more consistent and detailed level (e.g., programs instead of departments)

• The design approach

• Redesigned parts of survey using this model

• Conducted usability testing

#QDET2

Created a Hierarchy to Collect Data

33

• School/College: The Graduate school• Department: Biological Sciences

• Program: Cell Biology

• Provide counts of graduate students in cell biology by race/ethnicity, sex, etc

• Program: Botany

• Provide counts of graduate students in botany by race/ethnicity, sex, etc

• Department: Physics

• Program: Atmospheric physics

• Provide counts of graduate students in cell biology by race/ethnicity, sex, etc

#QDET2

Survey Framework Did Not Fit Users

34

• No common terminology

• What is a department vs program?

• Used other terminology altogether: division, concentration, track, field, subject

• No common hierarchy or structure

• Departments within programs and vice versa

• No departments, just programs

• Some departments had programs, some didn’t

• Information not available at level desired

#QDET2

Assessment & Verification:

35

#QDET2

• Low-fidelity prototypes

• Paper or computer mockups computer

• Wireframe

• High-fidelity prototypes

• Early interactive prototype / selected

interactive questions

• Finished product

Low-Fidelity Testing

36

• Methods: simple drawings or illustrations, Word/Excel/Visio, simple screen shots, website shell

• Uses: new questions or surveys, redesigns, evaluating information architecture, visual aspect of survey, web-centric features

• Benefits: allows for quick-feedback without spending too much time or money on programming, can apply results to other aspects of survey

• Don’t forget mobile testing (if using) at this stage!

#QDET2

What I can test: Paper Mock-Ups

37

#QDET2


38

#QDET2

What can I test: Mobile Paper Prototype

Image source: Craig, 2016; Geisen & Romano Bergstrom, 2017

What can I test: Computer mockup

39

#QDET2


What can I test: Low-Fidelity Wireframe

40

#QDET2

Romano Bergstrom et al., 2011

41

#QDET2

What can I test: High-Fidelity Wireframe

Romano Bergstrom et al., 2011

Design-based and iterative

42

#QDET2

Romano Bergstrom & Strohl, 2014

When to test

43

• Start as early as possible

• Testing should be integrated into the programming schedule, not conducted after

• Test in stages as web survey is being developed

• Test until all serious problems resolved / stop learning anything new (ideally)

• Iterative testing benefits from more rounds, fewer people

#QDET2

Reasons for more rounds, fewer people

44

• Identify more issues: 2 rounds of 5 users will likely identify more issues than 1 round of 10 users

• Diminishing returns from more users in each round

• Can be hard for users to see past the big glaring problems to other more subtle problems

• Allows you to test solution

• Good balance between testing resources and revision resources

• Quicker to summarize results and revise testing

#QDET2

Smaller rounds support collaboration

45

• Include stakeholders in testing

• Have programmers, clients, decision-makers observe testing live or remotely

• Direct observation is more exciting than reading a report

• Collaborative process

• Conduct tests in the morning, meet to discuss over a long lunch, recommendations for changes ready in the afternoon

• Report can then summarize findings and changes instead of findings and recommendations

#QDET2

Iterative Testing: Example

One box, prompt inside box

One box, prompt below box: resulted in

more complete names

Separate boxes, prompt below: even

more complete names

46

#QDET2

Geisen, Olmsted, Goerman & Lakhe, 2014

Iterative Testing: Example

One box, prompt inside box

One box, prompt below box: resulted in

more complete names

Separate boxes, prompt below: even

more complete names

47

#QDET2

Geisen, Olmsted, Goerman & Lakhe, 2014

Start with web & survey best practices

48

• User-centered evaluation includes best practices/findings from literature

• Abundance of literature

• Designing Effective Web Surveys (Couper)

• Internet, Mail, and Mixed-Mode Surveys (Dillman et al.)

• Jakob Nielsen

#QDET2

Build off the literature before doing usability testing

49

• Usability testing will show you how well or how easily people can do method A

• Will not necessarily show you that method A is definitively better than method B

• Not a replacement for large, probability-based methodological experiments

• Don’t reinvent the wheel

#QDET2

Building off the literature: Example

50

• Concern: Want to know best method for providing definitions in web surveys

• Ask: Has this been done before?

• Start with the literature:

• Conrad, Couper, Tourangeau, Peytchev (2006)

• Peytchev, Conrad, Couper, Tourangeau (2007)

• Peytchev, Conrad, Couper, Tourangeau (2010)

#QDET2

Building off the literature: Example (cont)

51

• Methods

• Experiment 1: one-click, two-clicks, click and scroll

• Experiment 2: roll-over, one-click, two-clicks

• Experiment 3: roll-over vs. always included

• Conclusions:

• Reading definitions probably improves accuracy

• Less effort required, more likely to read definitions

#QDET2

Example 2

52

• Methods

• Experiment 1: One-click, two-clicks, click and scroll

• Experiment 2: roll-over, one-click, two-clicks

• Experiment 3: hover-over vs always included

• Conclusions:

• Reading definitions probably improves accuracy

• Less effort required, more likely to read definitions

#QDET2

Start with the literature, but decide what’s relevant for your study

53

• The literature may not focus on the study population needed for your survey

• May not be any literature on the particular topic or issue your survey has

• And sometimes the experts just don’t agree,then what?

#QDET2







• Planning


• Reporting findings

54

#QDET2

2:00 – 3:45

4:00 – 5:30

Obstacles to Testing

55

• “There is no time.”

• Start early in development process.

• One morning a month with 3 users (Krug)

• 12 people in 3 days (Anderson Riemer)

• 12 people in 2 days (Lebson & Romano Bergstrom)

• “I can’t find representative users.”

• Everyone is important, and something is better than nothing.

• Remote testing or travel

• “We don’t have a lab.”

• You can test anywhere.

#QDET2

Planning

56

• Participant Selection and Recruitment

• Testing Location and Equipment

• Identifying Testing Focus/Concerns

• Identifying Measures to Collect

• Decide on Testing Roles

• Preparing Test Materials

#QDET2

Participants: determining target audience

57

• Recruit people who are like your target users

• Who is the survey for?

• Consider participants’ jobs and other roles

• Recruit diverse participants

• Age

• Income

• Education

• Is location important?

#QDET2

Participants: How many?

Adapted from: Nielsen and Landauer (1993)

#QDET2

Participant recruitment

59

• Existing Participant Lists (Existing Frame)

• Target respondents; small percentage of successful recruits

• No Participant Lists (Constructed Frame)

• Research firm with database

• Reliable; may be professional participants

• Hang fliers nearby

• Target locals; bit of work walking around

• Online social media ads

• Target specific criteria; social media users

• Classifieds

• Lots of responses quickly, non-Internet users; may be professional

• Snowball (word-of-mouth)

• Good for specific populations; They may know each other

#QDET2

Participant recruiting tips

60

• Recruit “floaters” (for no-shows and cancellations)

• Talk to your participants early

• Ask about specific behaviors relevant to your study (e.g. mobile usage, time spent online)

• Talk about what they’ll do and build rapport

• Get an email address and a contact number

• Schedule sessions ASAP (e.g., 3 weeks ahead)

• Remind them the day before

#QDET2

61

Location: Lab, Remote, In the Field

• Controlled environment

• All participants have the

same experience

• Record and communicate

from control room

• Observers watch from

control room and provide

additional probes (via

moderator) in real time

• Incorporate physiological

measures (e.g., eye

tracking, EDA)

• No travel costs

Laboratory Remote In the Field

• Participants tend to be

more comfortable in

their natural

environments

• Recruit hard-to-reach

populations (e.g.,

children, doctors)

• Moderator travels to

various locations

• Bring equipment (e.g.,

eye tracker)

• Natural observations

• Participants in their

natural environments

(e.g., home, work)

• Use video chat

(moderated sessions)

or online programs

(unmoderated)

• Conduct many

sessions quickly

• Recruit participants in

many locations (e.g.,

states, countries)

#QDET2

62

Lab-Based Usability Testing

#QDET2


63RTI Observation Room

#QDET2

64

Remote moderated Usability Testing

#QDET2


65

#AAPOR2016

Great for

quickly

assessing

different

designs –

gather large

amounts of

data quickly

0.00

0.10

0.20

0.30

0.40

0.50

0.60

Only Me Friends Confirm Change progress bar back on

phone

outside area

V1, N=2000

V2, N=1898

Pre-UX, N=864

First-click heat maps

Percentage of participants who made

the first click to these areas of interestRemote unmoderated testingexample

66

#QDET2

In-the-field usability testing


Testing location: summary

67

Your location / Facility User’s location

Pros • Use your equipment (e.g.,

one-way mirror, recording

software)

• Controlled setting with

no/few interruptions

• Simulates real user experience

• Users have access to info

• Easier to schedule/

accommodate users

Cons • Not true to real life

• More burden to user – more

no shows and cancelations

• Using your computer/device

not the user’s computer

• Need portable equipment or do

without

• Interviewer travel - increased

cost to researcher

• Safety matters

• Harder to schedule observers

#QDET2

Equipment: video/audio recording

68

• Helps with note-taking

• Reduces need to take notes or have a note-taker during interview

• Can take more nuanced notes afterwards if you can start and stop the video

• Helps after the interview

• Useful during debriefing to replay parts of video

• Accommodates observers who could not make it to the actual session

#QDET2

Equipment: Screen-sharing for observers

69

• Fosters collaboration

• Can accommodate observers from any location

• Facilitate discussions in conference setting

• Improved schedule

• Stakeholders get information immediately

• No waiting for recorded videos or report

• Cheaper

• Inexpensive compared to travel costs

• However, watching from their desks leads to less engagement than in-person – the best is always having observers in-person, in the observation room

#QDET2

Identify Testing Focus and Concerns

70

• General focus: Can they complete the survey?

• More specific concern: We are worried about the definitions.

• More specific: Are hover-overs an effective way of providing definitions?

• More specific: Do participants know definitions are available?

• More specific: Do participants understand what’s hover-overable?

• More specific: How helpful/unhelpful are the definitions?

#QDET2

More examples of specific concerns

71

• How well do people understand the instructions?

• Do people read the entire question and response options before responding? If not, what do they read?

• Can people use the Next and Previous navigation buttons correctly?

• Do people know what to do on each screen?

• How easily do people find the information they need to answer the questions?

• When people do not understand something or have a question, do they use the FAQs?

• Are the FAQs helpful/sufficient? What is missing?

• Are people able to correctly select their job from a long list of potential jobs?

• When do people use the left navigation, if at all?

• Can people use sliders correctly to select the desired response?

#QDET2

Identifying measures to collect

72

• Observational metrics tell us howhowhowhow participants navigate and interact.

• Self-report metrics tell us whywhywhywhy participants focus on certain site aspects.

• Eye tracking tells us what, how long, and how oftenwhat, how long, and how oftenwhat, how long, and how oftenwhat, how long, and how often participants focus on design elements.

• The combination of observational, self-report, and implicit data allows us to accurately measure

the user experience. We do not use eye tracking in isolation.

#QDET2

Include eye tracking?

73

Consider using eye tracking if you want to:Consider using eye tracking if you want to:Consider using eye tracking if you want to:Consider using eye tracking if you want to:

• Observe what attracts attention

• Discover potential areas of confusion/interest

• Watch as users learn to interact with an interface over time

• Validate/invalidate design changes

Usability Testing Usability Testing with Eye Tracking

Can users complete the survey

and individual items?

Do users see things that aid/hurt

completion?

• Direct observation of users’

behaviors

• Analysis: users’ conceptual

model vs. survey model

• Look patterns: locations, duration,

path

• Analysis: intended visual hierarchy

vs. actual look pattern

Evaluates usability Evaluates user experience

Supports improved ease of use Supports improved ease of use and

increased engagement

#QDET2

74

#QDET2

Eye tracking enables researchers to assess attention to motivational language and brand, which may impact response rate....

75Walton, Romano Bergstrom, Hawkins & Pierce, 2014

#QDET2

Eye tracking example: People read pages People read pages People read pages People read pages with with with with questions on them differently than other pagesquestions on them differently than other pagesquestions on them differently than other pagesquestions on them differently than other pages....

76Jarrett & Romano Bergstrom, 2014

The F-shaped eye-tracking pattern of the

block of text at the top of the page is

completely different from the eye-

tracking pattern on the question and

answer spaces at the bottom of the page.

#QDET2

Eye tracking example: People People People People dondondondon’’’’t read important t read important t read important t read important parts of survey invitation letters.parts of survey invitation letters.parts of survey invitation letters.parts of survey invitation letters.

77Olmsted-Hawala, Wang, Willimack, Burke & Lakhe, 2016

#QDET2

78

#AAPOR2016

When NOT to track

Jarrett & Romano Bergstrom, 2014

79

Slot-In Survey AnswersSlot-In Survey Answers

#QDET2


80

Gathered Survey

Answers

Gathered Survey

Answers

#QDET2


81

Created Survey

Answers

Created Survey

Answers

#QDET2


82

Third-Party Survey

Answers

Third-Party Survey

Answers

#QDET2


83

Include eye tracking? – Summary

#QDET2


Plan your measurements

84

• Examples of performance measures

• Success rate and/or speed for tasks

• Requests for help/assistance

• Number and types of errors that occurred (e.g., incorrect selections, menu choices)

• Count of features used (e.g., help menu, hover-over definitions, calculate button)

• Examples of preference measures

• Do you prefer A or B? Why?

• How or easy or difficult was it to do … Very easy, easy…

#QDET2

Organize roles

85

• Meet and greet

• Observers

• Test facilitator

• Note taker

• Videographer

#QDET2

Develop your test materials

86

• Develop consent forms, screeners

• Instructions/directions for participants

• Prepare written tasks/scenarios (on index cards)

• Pretest/posttest questionnaires

• Observer note sheets

#QDET2

Tasks, Scenarios, Probes

87

• Scenario – a real-life situation that you ask participants to put themselves in to test the instrument

• Task – something you want the participant to accomplish

• Probe – questions asked of the user to elicit additional information and feedback

#QDET2

A scenario brings the data together into a coherent story

88

• Keep scenarios short and simple

• Scenarios should reflect things participants might actually do

• Use vignettes to test rare/unusual situations

• Use the participant’s words, not researchers

• May need to prepare fake data to answer questions (e.g., SSN, phone number)

#QDET2

For some products, you need a task

89

• These are things that you want the user to do

• Often as simple as: “Please fill out this survey as you would at home”

• You may need specific tasks to match your test focus and concerns

#QDET2

90

Example 1 – Scenario

Romano Bergstrom, Childs, Olmsted-Hawala & Jurgenson, 2013

• Participants imagined they were at the respondent’s door

#QDET2

91


• Participants imagined they were at the respondent’s door

#QDET2


92


• To assess if the Information Sheet worked well, scripts were used to ensure interviewers could record difficult-to-record households.

#QDET2


Example 2 - Scenario and Task

Scenario: Your graduate school will include the following PhD programs this year:

• Biology

• Chemistry

• Marine, Earth, and Atmospheric Sciences

• Physics

• Spanish

Task: Please update the list of departments, programs, and research units that should be included in the survey for this year.

#QDET2

Don’t confuse participants with too many scenarios

95

• Whittle lists of tasks/scenarios to manageable number

• Prepare tasks to give to participants (index cards are useful)

• Tasks should flow in order of the survey

• Okay to change tasks between rounds

#QDET2







• Planning



96

#QDET2

2:00 – 3:45

4:00 – 5:30

Examples (Videos)

97

#QDET2

The day before the test

98

• Send out reminders

• Phone or email to respondents

• Email to stakeholders

• Equipment/Facility

• Check the computers and software (remote sharing, video recording), keyboard, mouse

• Make sure the room you’ll use is tidy

• Make sure your meet/greet person has the final list of participants’ names

• Incentives are available

#QDET2

The day before the test

99

#QDET2

Set-Up for Mobile

100

#QDET2


Set-Up for Mobile w Eye Tracking

101

Fors Marsh Group UX Lab Facebook UX Lab

#QDET2

Moderating Technique: Think Aloud

102

• Getting respondents to verbalize their thoughts

• Can be concurrent or retrospective

• Implementing

• Explain “thinking aloud” at the start

• Get the participant to try an example

• Remind them periodically (What are you thinking?)

• Snags:

• Thinking aloud is not natural for some people

• Others will start well, then forget

#QDET2

Think Aloud: concurrent vs retrospective

Concurrent

• Immediate thoughts (good recall)

• Procedural comments

• May affect task performance and usability metrics.

• Can interfere with eye-tracking data

• Shorter session length

• Less natural

Retrospective

• Relies on memory (recall failure)

• Explanatory comments

• No effect on task performance or usability metrics

• Accurate Eye-tracking data

• Session length increases

• More natural

103

#QDET2

Users may need help with thinking aloud

104

• Prompt as needed

• “What are you thinking?”

• “Tell me what you’re doing.

• “Tell me what you’re looking at.”

• “Keep talking.”

• “Tell me more about that.”

• Show you’re listening

• Be Patient

• Give reminders

#QDET2

Moderating Technique: Verbal Probing

105

• Ask targeted questions (probes) about content or functionality

• Explore content in more depth

• Concurrent vs retrospective

• Scripted vs spontaneous

#QDET2

Verbal Probing: Concurrent vs Retrospective

Concurrent

• Immediate thoughts (good recall) and more detail

• May be biased

• Affects task performance and usability metrics

• Ideal for exploratory tests and cognitive/usability combined tests

• Better for participants with low cognitive ability

Retrospective

• Relies on memory (recall failure), less detail

• Less biased

• No effect on task performance or usability metrics

• Can be used in any stage of testing

106

#QDET2

Verbal Probe Examples 1

Immediate thoughts or reactions

• What are your thoughts on this [screen]?

• What are you thinking?

• What are you doing?

• What are you looking at?

• What are you trying to do?

Does functionality match expectations?

• What do you expect to happen when you [click that link/button]?

• How did you expect that to work?

107

#QDET2

Verbal Probe Examples 2

Understand user

• What do you want to

accomplish?

• Can you describe the steps

you are taking now?

• How did you feel about

that process to [complete

task]?

• What’s going through your

mind right now?

Probing further

• Echoing

• Can you tell me more about that?

• Can you provide an example of [X]?

108

#QDET2

Probing Tips

109

• Avoid yes/no questions, people tend to be acquiescent

• Bad: “Was this task difficult to complete?”

• Good: “How easy or difficult was that task to complete?”

• Ask unbiased questions

• Bad: “Are you looking at the X link?”

• Good: “Can you tell me what are you looking at?”

• Be quiet and wait

• Bad: Impatiently asking “what’s happening?”

• Good: Count to 20 before jumping in. Or to 30.

#QDET2

When you hear yourself asking a leading question, balance it

110

Leading

question

“So you think

that’s difficult

then?”

Balanced

question

“...or was it

easy?”

#QDET2

Choosing a Moderating Technique

111

• Can the participant work completely alone?

• Will you need time on task and accuracy data?

• Are the tasks multi layered and/or require concentration?

• Will you be conducting eye tracking?

#QDET2

Moderating Techniques: Summary

112

#QDET2

Approach Advantages DisadvantagesConcurrent

Think

Aloud

• Feedback in real-time

• Good recall

• Procedural comments

• Shorter session length

• Unbiased feedback

• Easy for moderators to learn

• Slight effect on task

performance (vs. RTA)

• May affect usability

metrics.

• Some interference

with eye-tracking data

• Less natural

• Hard for some

participantsRetro-

spective

Think

Aloud

• Explanatory comments

• No effect on task performance

or usability metrics

• Accurate Eye-tracking data

• More natural

• Unbiased feedback

• Easy for moderators to learn

• Recall failure

• Longer session length

• Hard for some

participants

• Requires heavy cueing

Geisen & Romano Bergstrom, 2017

Moderating Techniques: Summary 2

113

#QDET2

Approach Advantages Disadvantages

Concurrent

Verbal

Probing

• Feedback in real-time

• Good recall

• Ask targeted questions

• More detailed comments

• Works well for exploratory tests,

cognitive/usability combined tests

• Easiest for participants, especially

with low cognitive ability

• May introduce bias

• Negative effect on task

performance and

usability metrics

• Hardest for moderators

to learn

• Longest session lengths

Retro-

spective

Verbal

Probing

• Less biased

• Ask targeted question

• No effect on task performance or

usability metrics

• Can be used in any stage of testing

• Easier for participants

• Recall failure

• Requires some cueing

• Less detailed

comments

• Hard for moderators to

learn (vs CTA, RTA)


Moderating Tips

114

• Maintain objective viewpoint

• Be prepared for surprises

• Report accurately

• Redirect participants to keep them on task

• Avoid coaching

• Don’t help participants

• Don’t ask if they would do “anything else”

• Don’t suggest: “Let’s try this”

#QDET2

Moderating Tips (Continued)

• Participant is silent

• Be patient. Then if necessary, ask “What are you thinking?”

• Participant asks you, “Is this right?”

• “We just want to see how you do it?”

• Participant asks you for help

• “What would you do if I wasn’t here?”

• Participant blames himself/herself

• “A lot of people have had this problem.”

• “Your feedback helps us learn what we need to improve.”

115

#QDET2

Provide neutral feedback

116

• Provide praise/feedback for every task, successful or unsuccessful

• Keep it neutral

• “mm hmmm:

• “uh huh”

• “That’s interesting”

• “that’s helpful”

• If you are writing notes then write down everything

• If you only write bad things, the participant will notice it’s biased

#QDET2







• Planning


• Analyzing Results

117

#QDET2

2:00 – 3:45

4:00 – 5:30

Analyzing Results

118

Analyze

• Collect all of your data together

• Summarize/Reduce to meaningful chunks

• Understanding what it means

Revise

• What can/should we do about it?

Test again

Barnum, 2011

#QDET2

Collect All of Your Data Together

119

• Self-reported

• Verbalizations

• Satisfaction and difficulty ratings from questionnaires

• Observational

• Usability metrics

• Click patterns

• Behavior and other observations

• Implicit:

• Eye-tracking data

#QDET2

120

“Focus ruthlessly on only the most serious problems”

Krug, 2010

#QDET2

Focus on the most serious problems

121

• Run your usability test

• Have a meeting with the key stakeholders

• Decide on most important problems (and problems that are easily fixed)

• Go away and fix them

• Ignore everything else

• Test again and repeat

#QDET2

Determining the problems to target

122

• Frequency of the problem (e.g., 5 out of 5 users)

• How likely are others to have this problem?

• What’s the impact on the survey (e.g., causes break-offs, inaccurate data)

• How much of the survey does it affect (e.g., local vs. global finding)?

• How easy/difficult is it to fix (low-hanging fruit)

#QDET2

Focus on findings that improve quality

123

• When usability testing a survey, focus on

• Improving data quality

• Reducing respondent burden

#QDET2

Determining what and how to fix problems is harder.

124

• Group debrief after the test to discuss

• Set priorities

• Most serious problems (ignore “nice to haves”)

• Problems that are easy to fix (e.g., typos, wording)

• How long will it take to fix?

• Will fixing it cause other potential problems?

• Recommendations should be specific/doable within timeframe and budget

• Not everything will get fixed

#QDET2

Weigh effect on data vs Effort to fix

125

#QDET2


With few users, each one really matters

126

• Are there outliers in small studies?

• How representative is each user?

• Are others likely to have this problem?

• Caveats for reporting data with few users

• Report numbers (4 out of 5) rather than percentages

• Report with numbers rather than words (most, usually, almost all)

• With a small number of users, you will find your biggest problems

• Iterate and/or conduct remote unmoderated testing, if possible

#QDET2

Q&A

127

Exa

mp

le C

on

sen

t Fo

rm

128

#AAPOR2016

129

#AAPOR2016

Exa

mp

le M

od

era

tor’

s G

uid

e

Exa

mp

le S

ati

sfa

ctio

n Q

ue

stio

nn

air

e

130

Exa

mp

le D

eb

rie

fin

g I

nte

rvie

w

131