Dan Berlin, Jon Strohl, David Hawkins and I presented this at UXPA 2013. Eye tracking is well known and accepted in the UX community. Here we present preliminary evidence for the usefulness of adding electrodermal activity (EDA), continuous dial ratings, etc. to user experience research.
Beyond Eye Tracking
Using user temperature, rating dials, and facial analysis to understand the user experience
Jen Romano Bergstrom, Jon Strohl, David Hawkins, Dan Berlin
UXPA2013 | Washington, DC
@romanocog @forsmarshgroup @banderlin
2
Client’s needs • Traditionally…
– What works well – What needs help
• Measure the UX
Observations
Selection/click behavior
Contextual observations
Time to complete task Reaction time
Accuracy Ability to complete tasks
4
Task efficiency and accuracy
Group | Accuracy | Steps to Complete Task* | Time to Complete Task*
Users | 10% | 8 | 170 seconds
Admins | 21% | 8.3 | 32 seconds
All Participants | 15% | 8.2 | 101 seconds
Session observations
5
• Observational click behavior • Facial expressions of frustration • Fidgeting and other observations of emotion
Areas of the website that participants explored first.
6
Explicit Post-task satisfaction questionnaires
Moderator follow up
In-session difficulty ratings
Verbal responses
Real-time +/- dial
Measure the UX by asking questions
Think aloud protocol
7
• Rooted in cognitive psychology and the study of thinking • Makes explicit what is implicitly present to participants • Concurrent vs. retrospective
“This is really confusing!”
Satisfaction questionnaires & difficulty ratings
8
• Assess users' subjective satisfaction • Consistent questionnaire used across interfaces or customized for its features and capabilities • Structured vs. unstructured
Satisfaction Questionnaire
Please circle the numbers that most appropriately reflect your impressions about using this Web-based instrument.
1. Overall reaction to the Web site: terrible 1 2 3 4 5 6 7 8 9 wonderful | not applicable
2. Screen layouts: confusing 1 2 3 4 5 6 7 8 9 clear | not applicable
3. Use of terminology throughout the Web site: inconsistent 1 2 3 4 5 6 7 8 9 consistent | not applicable
4. Information displayed on the screens: inadequate 1 2 3 4 5 6 7 8 9 adequate | not applicable
5. Arrangement of information on the screen: illogical 1 2 3 4 5 6 7 8 9 logical | not applicable
6. Tasks can be performed in a straight-forward manner: never 1 2 3 4 5 6 7 8 9 always | not applicable
7. Organization of information on the site: confusing 1 2 3 4 5 6 7 8 9 clear | not applicable
8. Forward navigation: impossible 1 2 3 4 5 6 7 8 9 easy | not applicable
9. Backward navigation: impossible 1 2 3 4 5 6 7 8 9 easy | not applicable
10. Overall experience of finding information: difficult 1 2 3 4 5 6 7 8 9 easy | not applicable
11. Census Bureau-specific terminology: too frequent 1 2 3 4 5 6 7 8 9 appropriate | not applicable
12. Overall reaction to the Web site:
    Terrible 1 2 3 4 5 6 7 Wonderful
    Frustrating 1 2 3 4 5 6 7 Satisfying
    Difficult 1 2 3 4 5 6 7 Easy
13. Additional Comments (use the back of this paper if necessary):
9
Client’s needs • For this project…
– What grabs attention? – What is engaging? – What is a turn off? – What about the videos? – Good parts? Bad? – Is green better than…?
A volunteer please
10
11
Client’s needs • For this project…
– What grabs attention? – What is engaging? – What is a turn off? – What about the videos? – Good parts? Bad? – Is green better than…? Explicit
Post-task satisfaction questionnaires
Moderator follow up
In-session difficulty ratings
Verbal responses
Real-time +/- dial
Observations
Selection/click behavior
Contextual observations
Time to complete task Reaction time
Accuracy Ability to complete tasks
Implicit measures
12
• Physiological responses are difficult to control • Implicit responses are unfiltered • Responses occur before explicit measures
Definition: Underlying reactions (e.g., eye tracking, arousal) that people are unaware of, cannot control, or cannot express at a granular level
Stimulus Implicit Responses
Thought Processes
Explicit Responses
Why don’t we measure the implicit?
13
• Very difficult, if even possible, to communicate the subconscious.
• Responses occur in a very short time interval.
• A lot of noise in the signal
• Unfamiliar lexicon used in the literature.
• The technology is just beginning to become usable by a wider audience.
• Analyses appear overwhelmingly time consuming and complicated.
• It’s difficult to justify the ROI.
Why should we measure the implicit?
14
• Evaluates thought processes and emotions (not what the participant tells you)
• Quantifiable data that goes beyond task performance • Moment by moment interaction • Cause and effect triggers • Deeper insights
Traditional research is good at explaining what people say and do, not what they think and feel.
16
Observations Selection/click behavior
Ethnography
Time to complete task Reaction time
Accuracy
Ability to complete tasks
The Complete UX
Explicit Post-task satisfaction questionnaires
Moderator follow up
In-session difficulty ratings
Verbal responses
Real-time +/- dial
Implicit
Eye tracking Electrodermal activity (EDA)
Behavioral analysis
Pupil dilation
Facial expression coding
Implicit associations
Linguistic analysis of verbalizations
Heart rate variability
Two categories of implicit measures
17
Biometrics Neuroimaging
Neuroimaging metrics
18
• Indirectly or directly measures activity in the brain.
• Typically measures the hemodynamic response or brain electrical activity.
• Examine what “people are thinking”
Why don’t we collect neuroimaging measures?
19
• Lots of resources • Expensive equipment • Complex analyses • Strict protocols • Unnatural environment
Biometrics
21
• Established in UX research
  – Eye tracking
• New to UX
  – Electrodermal activity
    • Skin conductance response
    • Body temperature
  – Facial expression analysis
  – Pupil dilation
  – Heart rate variability
  – Respiration
  – Blood pressure
Eye Tracking
22
What is eye tracking
23
• Observing and recording eye movements as a participant interacts with a product – Allows us to gain deeper insight into how users perform tasks
• Allows UX researchers to collect objective behavioral data
• Doesn’t include observing pupil dilation, blink rate, or facial recognition
Yesterday
Eye tracking today
24
Qualitative heat maps
25
• Aggregate of fixation count or duration across participants
Example: • Participants have similar fixation counts across links • Displays uncertainty of where to click to get started
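The aggregation behind a fixation-count heat map can be sketched as follows. This is a minimal illustration and not the method of any particular eye-tracking package: the fixation format (`(x, y)` in pixels), the grid cell size, and the function name `fixation_grid` are all assumptions made for the example.

```python
from collections import Counter


def fixation_grid(fixations_by_participant, cell=100):
    """Count fixations per grid cell, pooled across participants.

    fixations_by_participant: dict mapping participant id -> list of (x, y) pixels.
    cell: grid cell size in pixels (hypothetical default).
    """
    counts = Counter()
    for fixations in fixations_by_participant.values():
        for x, y in fixations:
            counts[(int(x // cell), int(y // cell))] += 1
    return counts


# Hypothetical fixation data for two participants.
demo = {
    "P1": [(120, 40), (130, 55), (610, 300)],
    "P2": [(150, 60), (640, 310)],
}
grid = fixation_grid(demo)
```

Cells with similar counts across several links would correspond to the "similar fixation counts" pattern described in the example above. Summing fixation durations instead of counting fixations gives the duration-based variant.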
Qualitative gaze plots
26
• Plot of fixations for a single participant Example:
• Participant fixates back and forth between two different sections
• Displays uncertainty on how to use the sections
• The instructional paragraph did not facilitate web reading
27
Example: • Participant has repeated fixations in the upper right-hand corner
• Participant said that he/she was looking for a search tool on the page
• The search tool was contained within a disappearing banner on the page
Qualitative gaze plots
Quantitative eye-tracking data
28
• Quantitative data
  – Attention
    • Time to first fixation: Are users finding the important content quickly?
    • Total number of fixations in an area of interest (AOI)
    • Percentage of fixations in an AOI compared to the total page: Are users spending an inordinate amount of time looking at a single area?
  – Processing
    • Fixation duration: Are users spending a long period of time in this area?
  – Efficiency
    • Repeat fixations: Is information clear and presented efficiently?
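As a rough sketch of how the attention and processing metrics listed above could be computed for one area of interest: the fixation tuple layout, the rectangular AOI, and the names `aoi_metrics` and `launch_button` are assumptions for illustration, not the output format of any specific eye tracker.

```python
def aoi_metrics(fixations, aoi):
    """Summarize fixations that land inside one rectangular AOI.

    fixations: list of (start_ms, duration_ms, x, y), ordered by start time.
    aoi: (left, top, right, bottom) in pixels.
    """
    left, top, right, bottom = aoi
    inside = [f for f in fixations
              if left <= f[2] <= right and top <= f[3] <= bottom]
    total = len(fixations)
    return {
        # Attention: how quickly the area is found, and how often it is viewed.
        "time_to_first_fixation_ms": inside[0][0] if inside else None,
        "fixation_count": len(inside),
        "percent_of_fixations": 100.0 * len(inside) / total if total else 0.0,
        # Processing: how long each look at the area lasts on average.
        "mean_fixation_duration_ms": (
            sum(f[1] for f in inside) / len(inside) if inside else None
        ),
    }


# Hypothetical data: four fixations, two of them on the AOI.
fixations = [
    (0, 200, 50, 50),
    (250, 300, 400, 120),
    (600, 180, 420, 130),
    (800, 220, 60, 500),
]
launch_button = (380, 100, 480, 160)
m = aoi_metrics(fixations, launch_button)
```

Running the same function over several AOIs, or over the same AOI in two designs, gives the comparison the slides describe. Repeat fixations (returns to an AOI after leaving it) are not covered by this sketch.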
Quantitative eye tracking
29
• Break the page up into separate “areas of interest” or AOIs
• Compare the fixation data between important areas and less important ones – Or compare data between designs
Areas of Interest
Combining quantitative and qualitative data
30
• Using multiple sources of data makes the evidence more compelling
• Example: “LAUNCH” was expected to be the most clicked • Heat map supports the quantitative eye-tracking data
Beyond eye tracking
31
• Eye tracking is just one type of biometric measure
• It tells us where participants are looking
• It does not tell us
  – Emotional state
  – Level of arousal
  – Level of mental workload
Facial expression analysis
32
33
Emotion Recognition Software
• Real-time and continuous tracking of facial expressions (Terzis, Moridis, Economides, 2010)
• Distinguishes between happy, angry, sad, surprised, scared, disgusted, and neutral – Overall accuracy of 89%
Bringing biometrics to UX research
36
Electrodermal Activity
37
What is it?
38
• Electrodermal activity (EDA) encompasses skin conductance responses and body temperature.
• Sympathetic nerve fibers trigger sweat glands to release sweat in response to a stimulus.
• Sweat facilitates the travel of an electrical signal.
• After a stimulus onset, glands return to a baseline status.
• Sweat secretion is related to sympathetic nervous system activity.
Who cares?
39
• Skin conductance is an established measure of arousal
• Arousal can indicate engagement, fear, frustration, or other emotional changes
• Continuously measure changes in arousal throughout a test
• Establish benchmarks and use them to compare against previous iterations
• Determine if the design facilitated typical levels of arousal or if there were specific triggers
EDA in UX research
40
• EDA can indicate usability problems
• Assess “good” and “bad” interfaces and compare biometrics (Ward & Marsden, 2002)
• “Bad” interface causes higher skin conductivity, lower blood volume, and increased pulse rate
• Assess frustration while playing a game (Lin and Hu, 2005)
41
How do I do it?
• The electrodes on an EDA sensor measure the resistance that an electrical current meets as it travels across the skin.
• Electrodes can be placed at three locations
  – Best option: Palm
  – Good option: Finger
  – Acceptable option: Wrist
• Wired and wireless available
EDA recording device & analysis software
The device that required the least amount of training
42
A less commonly used explicit measure: Dial rating
43
Dial Rating
44
FMG Rating Dial
• Continuous real-time feedback on videos and commercials
• Researcher can choose anchors for the ratings
• Teardrop-shaped knob allows participant to remain focused on the video
• Time sensitive
Position of dial
Max position of dial
Min position of dial
Dial Recorder Software
Visa Video Ad
45
46
EDA data
Columns: System Time | Movement Data | Temperature | Raw EDA Signal | Event Marker
47
• Tonic and phasic activity
  – Tonic activity is a slow, state-based level of arousal
  – Phasic activity is a rapid, stimulus-based change in arousal
• An EDA recording shows long periods of gradual change punctuated by a series of peaks.
[Figure: EDA signal over time; y-axis in µS (2.6 to 3.0), x-axis in seconds (0 to 30)]
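One crude way to illustrate the tonic/phasic distinction is to treat a moving average as the slow tonic level and the remainder as the fast phasic component. This is only a sketch under that assumption; it is not the decomposition used by the analysis software in this study, and the function name, window size, and sample values are hypothetical.

```python
def split_tonic_phasic(signal, window=5):
    """Crude split: tonic = centered moving average, phasic = signal - tonic.

    signal: list of skin conductance samples (µS) at a fixed sampling rate.
    window: number of samples in the moving average (hypothetical default).
    """
    half = window // 2
    tonic = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        tonic.append(sum(signal[lo:hi]) / (hi - lo))
    phasic = [s - t for s, t in zip(signal, tonic)]
    return tonic, phasic


# Hypothetical recording: flat level with a single brief peak.
eda = [2.6, 2.6, 2.6, 2.9, 2.6, 2.6, 2.6]
tonic, phasic = split_tonic_phasic(eda, window=3)
```

In this toy recording the phasic component is largest at the sample where the peak occurs, while the tonic estimate stays close to the resting level.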
Processing the EDA signal
48
• The phasic response begins 1-4 seconds after onset of stimulus • The signal is analyzed in discrete time intervals • The area under the curve is analyzed to determine changes
[Figure: EDA signal over time; y-axis in µS (2.6 to 3.0), x-axis in seconds (0 to 30). Annotations: response onset; returning to baseline; response onset; peak is delayed]
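The interval-based area-under-the-curve analysis described above can be sketched with the trapezoidal rule. The baseline value, the interval length, the function name `interval_areas`, and the sample data are assumptions for illustration. The sketch also assumes samples fall on the interval boundaries, since each segment is assigned to the interval in which it starts.

```python
def interval_areas(times, values, baseline, interval=5.0):
    """Trapezoidal area of (value - baseline) per fixed-length time interval.

    times: sample times in seconds, ascending.
    values: skin conductance samples (µS), same length as times.
    Returns a dict mapping interval start time -> area in µS·s.
    """
    areas = {}
    points = list(zip(times, values))
    for (t0, v0), (t1, v1) in zip(points, points[1:]):
        start = (t0 // interval) * interval
        area = 0.5 * ((v0 - baseline) + (v1 - baseline)) * (t1 - t0)
        areas[start] = areas.get(start, 0.0) + area
    return areas


# Hypothetical recording: one response peaking 3 s after onset, then flat.
times = [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
values = [2.6, 2.6, 2.8, 3.0, 2.8, 2.6, 2.6, 2.6, 2.6, 2.6, 2.6]
areas = interval_areas(times, values, baseline=2.6, interval=5.0)
```

Here the first 5-second interval contains the whole response and the second interval contributes no area, so comparing areas across intervals shows when arousal changed.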
Analyzing EDA data
49
Traditional Measures of Attention and Emotion
50
Explicit rating of attention: Please indicate how much you agree with the following statements
Response options: 1 (Not at all) | 2 | 3 | 4 | 5 | 6 | 7 (Extremely)
Q1: I found my mind wandering while the advertisement was on
Q2: While the advertisement was on, I found myself thinking about other things
Q3: I had a hard time keeping my mind on the advertisement
P | Q1 | Q2 | Q3 | Average
P1 | 1 | 1 | 1 | 1.0
P2 | 1 | 2 | 1 | 1.3
P3 | 1 | 1 | 1 | 1.0
P4 | 3 | 3 | 3 | 3.0
P5 | 2 | 2 | 2 | 2.0
P6 | 2 | 2 | 2 | 2.0
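The Average column is the mean of each participant's three item ratings, rounded to one decimal place. A short sketch using the ratings from the table above (the variable names are illustrative):

```python
# Ratings for the three attention statements, per participant (from the table).
ratings = {
    "P1": [1, 1, 1],
    "P2": [1, 2, 1],
    "P3": [1, 1, 1],
    "P4": [3, 3, 3],
    "P5": [2, 2, 2],
    "P6": [2, 2, 2],
}

# Mean of the three items per participant, rounded to one decimal place.
averages = {p: round(sum(r) / len(r), 1) for p, r in ratings.items()}
```

A single number per participant like this summarizes the whole viewing, which is why it cannot show when attention changed during the advertisement.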
51
Explicit rating of emotion: Please indicate how much you experienced each of the following while viewing the advertisement
Response options: 1 (Not at all) | 2 | 3 | 4 | 5 | 6 | 7 (Extremely)
Columns: (1) amused, fun-loving, or silly; (2) angry, irritated, or annoyed; (3) disgust, distaste, or revulsion; (4) guilty, repentant, or blameworthy; (5) inspired, uplifted, or elevated; (6) interested, alert, or curious; (7) joyful, glad, or happy; (8) sad, downhearted, or unhappy; (9) scared, fearful, or afraid; (10) sympathy, concern, or compassion; (11) surprised, amazed, or astonished
P | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11
P1 | 2 | 1 | 1 | 1 | 1 | 3 | 2 | 1 | 1 | 1 | 1
P2 | 2 | 3 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
P3 | 4 | 1 | 1 | 1 | 2 | 3 | 3 | 1 | 1 | 1 | 2
P4 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
P5 | 4 | 1 | 1 | 1 | 3 | 4 | 4 | 1 | 1 | 1 | 1
P6 | 5 | 1 | 1 | 1 | 3 | 4 | 4 | 1 | 1 | 1 | 2
52
• When? – When did minds start to wander? – When were people engaged?
• What? – What did people focus on? – What did people miss? – What caused the negative/positive emotions?
• Was it something specific or overall?
Unanswered Questions
53
New Measures of Attention and Emotion
54
Traditional Likert-Scale Overall Rating
New Continuous Dial Rating
Visa Video Ad Example Question: Please indicate how much you experienced each of the following while viewing the advertisement. Response options: Not At All | A little bit| Moderately | Quite a bit | Extremely
[Figure: continuous dial rating traces for P1 to P6 and the mean; y-axis from -1.1 to 1.1]
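The mean trace shown alongside the individual participants can be computed per time step. A minimal sketch, assuming each participant's dial position is sampled at the same fixed rate; the function name `mean_trace` and the sample values are hypothetical, not data from the study.

```python
def mean_trace(traces):
    """Average dial position across participants at each time step.

    traces: dict mapping participant id -> list of dial positions,
            all sampled at the same times. Extra trailing samples in
            longer traces are ignored.
    """
    length = min(len(t) for t in traces.values())
    n = len(traces)
    return [sum(t[i] for t in traces.values()) / n for i in range(length)]


# Hypothetical dial traces for two participants, three samples each.
demo_traces = {
    "P1": [0.0, 0.5, 1.0],
    "P2": [0.0, -0.5, 0.0],
}
mean = mean_trace(demo_traces)
```

Unlike a single overall rating, the mean trace keeps the time dimension, so a dip or rise can be lined up with a specific moment in the video.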
55
Electrodermal Activity: Visa Video Ad
[Figure: EDA trace with "+" markers; y-axis from 1.6 to 2.1, x-axis from -5 to 32]
Event annotations on the figure:
• [music only, screen change from bright to dark]
• [dramatic screen change to black with white words, "without the worry of currency exchange"; music consistent]
• [almost falls in water]
• [tail end of previous screen which appeared for several seconds and then change to first mention of brand]
• [middle of second screen change, MUSIC changes]
• [music change]
• [scene bright and beachy]
56
Traditional Likert-Scale Overall Rating
New Physiological Measure of Arousal
Visa Video Ad Example Question: Please indicate how much you experienced each of the following while viewing the advertisement. Response options: Not At All | A little bit| Moderately | Quite a bit | Extremely
[Figure: EDA trace for the Visa video ad; y-axis from 1.6 to 2.1, x-axis from -5 to 32]
Artery Video Ad
57
Artery Video Ad Example: Traditional Measures
58
Traditional Likert-Scale Overall Rating
Question: Please indicate how much you experienced each of the following while viewing the advertisement.
Response options: Not At All | A little bit | Moderately | Quite a bit | Extremely
Columns: (1) amused, fun-loving, or silly; (2) angry, irritated, or annoyed; (3) disgust, distaste, or revulsion; (4) guilty, repentant, or blameworthy; (5) inspired, uplifted, or elevated; (6) interested, alert, or curious; (7) joyful, glad, or happy; (8) sad, downhearted, or unhappy; (9) scared, fearful, or afraid; (10) sympathy, concern, or compassion; (11) surprised, amazed, or astonished
P | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9 | 10 | 11
P1 | 1 | 1 | 2 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1
P2 | 1 | 1 | 5 | 1 | 1 | 1 | 1 | 2 | 1 | 1 | 4
P3 | 3 | 1 | 3 | 1 | 1 | 2 | 1 | 1 | 1 | 3 | 3
P4 | 1 | 3 | 5 | 1 | 1 | 3 | 1 | 3 | 1 | 1 | 5
P5 | 1 | 1 | 3 | 1 | 1 | 3 | 1 | 2 | 1 | 1 | 1
P6 | 1 | 1 | 5 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 3
Artery video example
59
Traditional Likert-Scale Overall Rating
New Continuous Dial Rating
Question: Please indicate how much you experienced each of the following while viewing the advertisement. Response options: Not At All | A little bit| Moderately | Quite a bit | Extremely
[Figure: continuous dial rating traces for P2 to P6 (video 1) and the mean; y-axis from -1.2 to 0.2, x-axis from 0 to 30]
Continuous dial rating: Artery video
60
[sound of rushing air] "this much was found stuck to the aorta..."
"every cigarette is doing you damage"
Electrodermal activity: Artery video
61
Traditional Likert-Scale Overall Rating
New Physiological Measure of Arousal
Question: Please indicate how much you experienced each of the following while viewing the advertisement. Response options: Not At All | A little bit| Moderately | Quite a bit | Extremely
[Figure: traces for P1 to P6 and the mean; y-axis from 0.0 to 7.0]
Electrodermal activity: Artery video
62
[Figure: EDA trace with "+" markers; y-axis from 1.6 to 2.1, x-axis from -5 to 32]
Event annotations on the figure:
• "...the main artery from the heart"
• "every cigarette is doing you damage"
• [voice, pace change] "authorized by the Australian government"
• "this much was found stuck to the aorta..."
• [sound of rushing air] [first fatty deposits emerge]
• "every cigarette is doing you damage"
• [sound effect; no text] "age 32" [heartbeats] [sound of crackling embers]
EDA does not capture valence
63
P1: Artery ad (Negative emotion)
P1: Visa ad (Positive emotion)
Continuous Dial Rating: Artery vs. Visa
64
[Figure: two continuous dial rating plots. First plot: traces for P1 to P6 and the mean, y-axis from -1.1 to 1.1. Second plot: traces for P2 to P6 (video 1) and the mean, y-axis from -1.2 to 0.2, x-axis from 0 to 30]
EDA advantages and disadvantages
65
• Advantages
  – Continuous measure of automatic physiological response
  – Sensitive to minor changes in arousal
  – Informs order of magnitude
• Disadvantages
  – Does not inform valence
  – Peak of physiological response is slow
  – Sometimes difficult to collect
[Figure: bar chart of mean intrusiveness rating (y-axis from 0 to 2.5) for Dial, Eye Tracker, and EDA]
Debriefing question: On a scale of 1 to 5, how intrusive was ____ while you were trying to complete the tasks and watch videos?
Dial: Two participants rated the dial as very intrusive (4): “I was having to concentrate on what my reaction was, not just have it.” “It’s not something I normally do, or something I do consciously.”
EDA: Three participants rated the wrist band as moderately intrusive (3): “It was itchy.” “I had to remember not to move it.” “I didn’t know where to put it.”
The future of implicit measures
66
We need to be taking a collaborative approach
67
• Disparate measures of physiological response can tell a cohesive story!
• By analyzing different streams of data we can uncover a very rich level of analysis.
Combining implicit measures for meaningful insights
69
[Figure: three simulated traces; y-axis from -1.1 to 1.1]
• Simulated pupil diameter data
• Simulated heart rate variability data
• Simulated EDA data
EDA: promising future
70
• Promising results
  – When data is good, EDA provides continuous, “objective” arousal measure
  – There is consistency between:
    • The Likert scale and the continuous dial data
    • Self-reported emotion overall and EDA data
  – EDA provides additional data above and beyond self-report measures
  – Most complete story can be told with a combination of measures.
71
• Data Analyses
  – Compare to baseline: different baseline per person and per stimulus
  – How does pupil dilation data compare with EDA?
  – Reduce the intrusiveness ratings for all metrics
Lessons learned
• Dial
  – If ET is not used, allow participants to look at the dial when making responses
  – Include simple practice task to increase familiarity
• Eye Tracker
  – Instruct participants to visually search as if they were at home on their own computer
• EDA
  – Improve quality of EDA data; explore equipment
  – Provide a cushion/pad to rest arm
  – Over-recruit
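The "compare to baseline" item, with a separate baseline per person and per stimulus, could look something like the sketch below. It assumes each recording includes a few seconds before stimulus onset (negative times) and uses the mean of those samples as the baseline; the function name, the data layout, and the values are hypothetical.

```python
def baseline_correct(recordings):
    """Subtract a per-participant, per-stimulus baseline from each recording.

    recordings: dict mapping (participant, stimulus) -> list of (time_s, value),
                where negative times are samples taken before stimulus onset.
    Returns the post-onset samples with the pre-onset mean subtracted.
    """
    corrected = {}
    for key, samples in recordings.items():
        pre = [v for t, v in samples if t < 0]
        if not pre:
            raise ValueError(f"no pre-stimulus samples for {key}")
        baseline = sum(pre) / len(pre)
        corrected[key] = [(t, v - baseline) for t, v in samples if t >= 0]
    return corrected


# Hypothetical recordings: two participants with different resting levels.
recordings = {
    ("P1", "ad_a"): [(-2, 1.8), (-1, 1.8), (0, 1.8), (1, 2.0)],
    ("P2", "ad_a"): [(-2, 2.5), (-1, 2.7), (0, 2.6), (1, 2.9)],
}
corrected = baseline_correct(recordings)
```

After correction, both participants start near zero at onset, so their changes in response to the stimulus can be compared even though their resting levels differ.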
Select your measure carefully
72
• Where are participants dwelling on instructions and tasks? – Eye tracking
• Which specific elements on a page are particularly stressful? – Eye tracking, EDA
• Which content is very engaging for the user? – Eye tracking, EDA, satisfaction questions, debriefing interview
• Which design causes more stress on the user? – EDA, debriefing interview
Not just about usability but also interaction
73
Interfaces that adjust based on affective state and workload
74
Video games that adapt to a user’s experience
75
Cognitive training programs that adjust to a person’s ability
76
But for UX…
77
Pushing our research further
78
• There are lessons to be learned from neuromarketing
  – Neuromarketing researchers have used EDA, heart rate variability, and even fMRI and EEG in an attempt to determine how users experience an advertisement.
• UX has a different set of requirements
  – To become more usable for practitioners, we need:
    • Portable technology that can be taken when traveling
    • Software that has a short learning curve
    • Customizations that allow for sensors to be wrist mounted and more literature to substantiate the use of this sensor location
    • Analysis protocols that can be completed in a short period of time.
Issues to keep in mind
79
• We want to mimic real-world experiences during a usability study • Complex setup will confound our experimental design • Participant comfort is paramount
• Concurrent think-aloud vs. Retrospective think-aloud • A talking participant is a distracted participant
• We always need to provide support for a ROI
Where do we go from here?
80
• We need to:
  – Collaborate to move our field forward
  – Share methods and analysis protocols
  – Empirically test our hypotheses
  – Continually provide proof for ROI
Thank you!
81
Jennifer Romano Bergstrom: jbergstrom@forsmarshgroup.com | @romanocog
Dan Berlin: dberlin@madpow.net | @banderlin
Jon Strohl: jstrohl@forsmarshgroup.com | @jonstrohl
David Hawkins: dhawkins@forsmarshgroup.com | @dHawk87
UXPA2013 | Washington, DC