A talk I gave at the Netflix offices on July 2nd, 2012. Please do not use any of the slides or their contents without my explicit permission ([email protected] for inquiries).
The User Experience of Recommender Systems
Introduction: What I want to do today
INFORMATION AND COMPUTER SCIENCES
Introduction: Bart Knijnenburg
- Current:
  - Informatics PhD candidate, UC Irvine
  - Research intern, Samsung R&D
- Past:
  - Researcher & teacher, TU Eindhoven
  - MSc Human Technology Interaction, TU Eindhoven
  - M Human-Computer Interaction, Carnegie Mellon
Introduction: Bart Knijnenburg
- UMUAI paper: Explaining the User Experience of Recommender Systems (first UX framework in RecSys)
- Founder of UCERSTI: workshop on user-centric evaluation of recommender systems and their interfaces
  - this year: tutorial on user experiments
Introduction: My goal
Introduce my framework for user-centric evaluation of recommender systems
Explain how the framework can help in conducting user experiments

My approach:
- Start with “beyond the five stars”
- Explain how to improve upon this
- Introduce a 21st century approach to user experiments
Introduction
“What is a user experiment?”

“A user experiment is a scientific method to investigate how and why system aspects influence the users’ experience and behavior.”
Introduction: What I want to do today
Beyond the 5 stars: The Netflix approach
Towards user experience: How to improve upon the Netflix approach
Evaluation framework: A framework for user-centric evaluation
Measurement: Measuring subjective valuations
Analysis: Statistical evaluation of results
Some examples: Findings based on the framework
Beyond the 5 stars: The Netflix approach
Beyond the 5 stars
Offline evaluations may not give the same outcome as online evaluations
Cosley et al., 2002; McNee et al., 2002
Solution: Test with real users
Beyond the 5 stars
[Diagram: System (algorithm) -> Interaction (rating)]
Beyond the 5 stars
Higher accuracy does not always mean higher satisfaction
McNee et al., 2006
Solution: Consider other behaviors
“Beyond the five stars”
Beyond the 5 stars
[Diagram: System (algorithm) -> Interaction (rating, consumption, retention)]
Beyond the 5 stars
The algorithm accounts for only 5% of the relevance of a recommender system
Francisco Martin - RecSys 2009 keynote
Solution: test those other aspects
“Beyond the five stars”
Beyond the 5 stars
[Diagram: System (algorithm, interaction, presentation) -> Interaction (rating, consumption, retention)]
Beyond the 5 stars
Start with a hypothesis: Algorithm/feature/design X will increase member engagement and ultimately member retention
Design a test: Develop a solution or prototype; think about dependent & independent variables, control, significance…
Execute the test
Beyond the 5 stars
Let data speak for itself: many different metrics
We ultimately trust member engagement (e.g. hours of play) and retention
Is this good enough? I think it is not
Towards user experience: How to improve upon the Netflix approach
User experience
“Testing a recommender against a static videoclip system, the number of clicked clips and total viewing time went down!”
[Path model: personalized recommendations (OSA) increase perceived recommendation quality (SSA), which increases perceived system effectiveness and choice satisfaction (EXP); yet the recommender reduced the number of clips clicked and the total viewing time, while increasing the number of clips watched from beginning to end]
User experience
Knijnenburg et al.: “Receiving Recommendations and Providing Feedback”, EC-Web 2010
User experience
Behavior is hard to interpret: the relationship between behavior and satisfaction is not always trivial
User experience is a better predictor of long-term retention: with behavior only, you will need to run for a long time
Questionnaire data is more robust: fewer participants are needed
User experience
Measure subjective valuations with questionnaires: perception and experience
Triangulate these data with behavior: find out how they mediate the effect on behavior
Measure every step in your theory: create a chain of mediating variables
User experience
Perception: do users notice the manipulation?
- Understandability, perceived control
- Perceived recommendation diversity and quality
Experience: self-relevant evaluations of the system
- Process-related (e.g. choice difficulty)
- System-related (e.g. system effectiveness)
- Outcome-related (e.g. choice satisfaction)
User experience
[Framework diagram: System (algorithm, interaction, presentation) -> Perception (usability, quality, appeal) -> Experience (system, process, outcome) -> Interaction (rating, consumption, retention)]
User experience
Personal and situational characteristics may have an important impact
Adomavicius et al., 2005; Knijnenburg et al., 2012
Solution: measure those as well
User experience
[Framework diagram: System -> Perception -> Experience -> Interaction, now with Personal Characteristics (gender, privacy, expertise) and Situational Characteristics (routine, system trust, choice goal) added]
Evaluation framework: A framework for user-centric evaluation
Framework
Framework for user-centric evaluation of recommenders
Knijnenburg et al.: “Explaining the user experience of recommender systems”, UMUAI 2012
[Framework diagram: System (algorithm, interaction, presentation) -> Perception (usability, quality, appeal) -> Experience (system, process, outcome) -> Interaction (rating, consumption, retention); Personal Characteristics (gender, privacy, expertise) and Situational Characteristics (routine, system trust, choice goal) moderate these relations]
Framework
Objective System Aspects (OSA)
- visual / interaction design
- recommender algorithm
- presentation of recommendations
- additional features
Framework
User Experience (EXP)
Different aspects may influence different things:
- interface -> system evaluations
- preference elicitation method -> choice process
- algorithm -> quality of the final choice
Framework
Perception or Subjective System Aspects (SSA)
Link OSA to EXP (mediation)
Increase the robustness of the effects of OSAs on EXP
How and why OSAs affect EXP
Framework
Personal Characteristics (PC)
Users’ general disposition
Beyond the influence of the system
Typically stable across tasks
Framework
Situational Characteristics (SC)
How the task influences the experience
Also beyond the influence of the system
Here used for evaluation, not for augmenting algorithms!
Framework
Interaction (INT)
Observable behavior:
- browsing, viewing, log-ins
Final step of evaluation, but not the goal of the evaluation
Triangulate with EXP
Framework
Has suggestions for measurement scales
e.g. How can we measure something like “satisfaction”?
Provides a good starting point for causal relations
e.g. How and why do certain system aspects influence the user experience?
Useful for integrating existing work
e.g. How do recommendation list length, diversity and presentation each influence the user experience?
Framework
“Statisticians, like artists, have the bad habit of falling in love with their models.”
George Box
Measurement: Measuring subjective valuations
Measurement
“To measure satisfaction, we asked users whether they liked the system (on a 5-point rating scale).”
Measurement
Does the question mean the same to everyone?
- John likes the system because it is convenient
- Mary likes the system because it is easy to use
- Dave likes it because the recommendations are good
A single question is not enough to establish content validity
We need a multi-item measurement scale
Measurement
Measure at least 5 items per concept: at least 2-3 should remain after removing bad items
It is most convenient to use statements to which the participant can agree/disagree on a 5- or 7-point scale:
“I am satisfied with this system.”
Completely disagree, somewhat disagree, neutral, somewhat agree, completely agree
Measurement
Use both positively and negatively phrased items
- They make the questionnaire less “leading”
- They help filter out bad participants
- They explore the “flip-side” of the scale

The word “not” is easily overlooked
Bad: “The recommendations were not very novel”
Better: “The recommendations were NOT very novel”
Best: “The recommendations felt outdated”
Measurement
Choose simple over specialized words: participants may have no idea they are using a “recommender system”
Use as few words as possible, but always use complete sentences
Measurement
Use design techniques that help improve recall and provide appropriate time references
Bad: “I liked the signup process” (one week ago)
Good: Provide a screenshot
Best: Ask the question immediately afterwards

Avoid double-barreled items
Bad: “The recommendations were relevant and fun”
Measurement
“We asked users ten 5-point scale questions and summed the answers.”
Measurement
Is the scale really measuring a single thing?
- 5 items measure satisfaction, the other 5 convenience
- The items are not related enough to make a reliable scale
Are two scales really measuring different things?
- They are so closely related that they actually measure the same thing
We need to establish convergent and discriminant validity
This makes sure the scales are unidimensional
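Before running a full factor analysis, a quick first diagnostic of whether a multi-item scale hangs together is Cronbach's alpha. A minimal sketch in plain numpy, on simulated questionnaire responses (all numbers hypothetical):

```python
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """Internal-consistency reliability of a multi-item scale.

    items: (n_participants, n_items) matrix of responses.
    alpha = k/(k-1) * (1 - sum of item variances / variance of the sum score)
    """
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1).sum()
    total_var = items.sum(axis=1).var(ddof=1)
    return k / (k - 1) * (1 - item_vars / total_var)

# Simulated 5-item satisfaction scale: every item reflects the same
# underlying attitude plus noise, so the scale should come out reliable.
rng = np.random.default_rng(0)
attitude = rng.normal(size=500)
responses = attitude[:, None] + 0.5 * rng.normal(size=(500, 5))
print(round(cronbach_alpha(responses), 2))
```

A low alpha would signal exactly the failure modes above: items that are not related enough to form one reliable scale.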
Measurement
Solution: factor analysis
- Define latent factors, specify how items “load” on them
- Factor analysis will determine how well the items “fit”
- It will give you suggestions for improvement
Benefits of factor analysis:
- Establishes convergent and discriminant validity
- Outcome is a normally distributed measurement scale
- The scale captures the “shared essence” of the items
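The factor-analysis step can be sketched with scikit-learn on simulated questionnaire data (the constructs, loadings, and sample sizes here are hypothetical; a confirmatory factor analysis in a dedicated package would be the more rigorous choice):

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Simulate questionnaires: 500 participants, two latent constructs
# ("satisfaction", "convenience"), five items loading on each.
rng = np.random.default_rng(1)
sat = rng.normal(size=(500, 1))
conv = rng.normal(size=(500, 1))
items = np.hstack([
    sat + 0.6 * rng.normal(size=(500, 5)),   # items 0-4: satisfaction
    conv + 0.6 * rng.normal(size=(500, 5)),  # items 5-9: convenience
])

fa = FactorAnalysis(n_components=2, rotation="varimax").fit(items)
loadings = fa.components_.T  # (10 items x 2 factors)

# Convergent/discriminant validity: each item should load strongly on
# its own factor and only weakly on the other.
print(np.round(np.abs(loadings), 2))
```

Items with weak or cross-loadings are the candidates for removal.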
Measurement
[Factor analysis diagram: latent factors General privacy concerns, Control concerns, and Collection concerns, each measured by multiple questionnaire items]
Measurement
[The same factor model after removing poorly fitting items: fewer, better-fitting items remain per factor]
Analysis: Statistical evaluation of results
Analysis
Manipulation -> perception: Do these two algorithms lead to a different level of perceived quality?
T-test
[Bar chart: perceived quality for algorithms A and B]
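Assuming each participant's perceived-quality score has already been computed (e.g. as a factor score), the comparison boils down to one line of scipy; the data below are simulated:

```python
import numpy as np
from scipy import stats

# Hypothetical perceived-quality scores for participants who saw
# algorithm A vs. algorithm B (simulated effect of 0.5 SD).
rng = np.random.default_rng(2)
quality_a = rng.normal(loc=0.0, scale=1.0, size=200)
quality_b = rng.normal(loc=0.5, scale=1.0, size=200)

t, p = stats.ttest_ind(quality_a, quality_b)
# With this simulated effect, p should fall below .05.
print(f"t = {t:.2f}, p = {p:.4f}")
```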
Analysis
Perception -> experience: Does perceived quality influence system effectiveness?
Linear regression
[Scatterplot: system effectiveness as a function of recommendation quality]
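The regression step, again on hypothetical factor scores (simulated data, made-up effect size):

```python
import numpy as np
from scipy import stats

# Does perceived recommendation quality predict perceived system
# effectiveness? Simulated scores with a true slope of 0.6.
rng = np.random.default_rng(3)
quality = rng.normal(size=200)
effectiveness = 0.6 * quality + rng.normal(scale=0.8, size=200)

fit = stats.linregress(quality, effectiveness)
print(f"slope = {fit.slope:.2f}, r^2 = {fit.rvalue**2:.2f}, p = {fit.pvalue:.4f}")
```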
[Path model: personalized recommendations (OSA) -> perceived recommendation quality (SSA) -> perceived system effectiveness (EXP), both paths positive]
Analysis
Manipulation -> perception -> experience
Does the algorithm influence effectiveness via perceived quality?
Path model
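Setting the full SEM machinery aside, the logic of such a mediation test can be sketched with plain least squares on simulated data (the effect sizes are made up; a real analysis would also test the indirect effect's significance, e.g. by bootstrapping):

```python
import numpy as np

# Simulated experiment: the condition (personalized vs. random
# recommendations) affects effectiveness only THROUGH perceived
# quality, i.e. full mediation.
rng = np.random.default_rng(4)
n = 2000
condition = rng.integers(0, 2, size=n)          # OSA manipulation (0/1)
quality = 0.8 * condition + rng.normal(size=n)  # SSA
effect = 0.7 * quality + rng.normal(size=n)     # EXP

def ols(y, X):
    """Least-squares slope coefficients, with an intercept column added."""
    X = np.column_stack([np.ones(len(y)), X])
    return np.linalg.lstsq(X, y, rcond=None)[0][1:]

a = ols(quality, condition)[0]                  # condition -> quality
b, c_prime = ols(effect, np.column_stack([quality, condition]))
print(f"a = {a:.2f}, b = {b:.2f}, direct c' = {c_prime:.2f}, indirect a*b = {a * b:.2f}")
```

A sizable indirect effect a*b with a near-zero direct effect c' is the mediation signature: the manipulation works via perceived quality.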
Analysis
Two manipulations -> perception:
What is the combined effect of list diversity and list length on perceived recommendation quality?
Factorial ANOVA
[Interaction plot: perceived quality for 5, 10, and 20 items, at low vs. high diversification]
Willemsen et al.: “Not just more of the same”, submitted to TiiS
Analysis
Manipulation x personal characteristic -> outcome
Do experts and novices rate these two interaction methods differently in terms of usefulness?
ANCOVA
[Plot: system usefulness as a function of domain knowledge, for case-based vs. attribute-based PE]
Knijnenburg & Willemsen: “Understanding the Effect of Adaptive Preference Elicitation Methods”, RecSys 2009
Analysis
Use only one method: structural equation models
- A combination of Factor Analysis and Path Models
- The statistical method of the 21st century
- Statistical evaluation of causal effects
- Report as causal paths
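A full SEM package (e.g. lavaan in R, or semopy in Python) estimates the measurement and structural parts simultaneously; purely as an illustration of the "factor analysis + path model" idea, here is a two-step stand-in on simulated data, not a substitute for real SEM:

```python
import numpy as np
from sklearn.decomposition import FactorAnalysis

# Two latent variables, each measured by five noisy questionnaire items;
# the true structural path quality -> effectiveness is 0.7.
rng = np.random.default_rng(7)
n = 1000
quality = rng.normal(size=n)                                   # latent SSA
effectiveness = 0.7 * quality + rng.normal(scale=0.7, size=n)  # latent EXP
q_items = quality[:, None] + 0.5 * rng.normal(size=(n, 5))
e_items = effectiveness[:, None] + 0.5 * rng.normal(size=(n, 5))

def scores(items):
    """Step 1: one-factor scores; sign-align with the first item
    (factor orientation is arbitrary)."""
    s = FactorAnalysis(n_components=1).fit_transform(items).ravel()
    return s if np.corrcoef(s, items[:, 0])[0, 1] > 0 else -s

# Step 2: path regression between the factor scores.
q, e = scores(q_items), scores(e_items)
path = np.polyfit(q, e, 1)[0]
print(f"estimated path coefficient: {path:.2f}")
```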
Analysis
We compared three recommender systems with three different algorithms
The MF-I and MF-E algorithms are more effective. Why?
[Bar chart comparing the three algorithms; labels unrecoverable]
Analysis
The mediating variables show the entire story
Knijnenburg et al.: “Explaining the user experience of recommender systems”, UMUAI 2012
[Structural equation model: the MF-E and MF-I matrix factorization recommenders (OSA, each vs. the generally most popular baseline, GMP) increase perceived recommendation variety and quality (SSA), which increase perceived system effectiveness (EXP); participant's age (PC) and behaviors such as number of items rated, number of items rated higher than the predicted rating, number of items clicked in the player, and rating of clicked items are also linked; coefficients and p-values omitted]
Analysis
Knijnenburg et al.: “Explaining the user experience of recommender systems”, UMUAI 2012
Analysis
“All models are wrong, but some are useful.”
George Box
Some examples: Findings based on the framework
Some examples
Giving users many options:
- increases the chance they find something nice
- also increases the difficulty of making a decision
Result: choice overload
Solution:
- Provide short, diversified lists of recommendations
Some examples
[Structural equation model: Top-20 vs. Top-5 and Lin-20 vs. Top-5 recommendations (OSA) affect perceived recommendation variety and quality (SSA), which in turn affect choice difficulty and choice satisfaction (EXP); movie expertise (PC) also plays a role; coefficients and p-values omitted]
Some examples
[Plots: choice difficulty, perceived quality, and choice satisfaction for 5, 10, and 20 items, at low vs. high diversification]
Some examples
Experts and novices differ in their decision-making process
- Experts want to leverage their domain knowledge
- Novices make decisions by choosing from examples
Result: they want different preference elicitation methods
Solution: Tailor the preference elicitation method to the user
A needs-based approach seems to work well for everyone
Some examples
[Plot: system usefulness as a function of domain knowledge, for case-based vs. attribute-based PE]
Some examples
[Scatterplots: control, understandability, user interface satisfaction, perceived system effectiveness, and choice satisfaction, each as a function of domain knowledge, for five preference elicitation methods (TopN, Sort, Explicit, Implicit, Hybrid); significance markers omitted]
Some examples
Social recommenders limit the neighbors to the users’ friends
Result:
- More control: friends can be rated as well
- Better inspectability: recommendations can be traced
We show that each of these effects exists; they both increase overall satisfaction
Some examples
[Screenshots: rate likes; weigh friends]
Some examples
[Structural equation model: Control (item/friend vs. no control) and Inspectability (full graph vs. list only) (OSA) increase understandability, perceived control, and perceived recommendation quality (SSA), which increase satisfaction with the system (EXP), which in turn influences average rating and inspection time (INT); music expertise, familiarity with recommenders, and trusting propensity (PC) also matter; coefficients and R² values omitted]
Introduction: I discussed how to do user experiments with my framework
Beyond the 5 stars: The Netflix approach is a step in the right direction
Towards user experience: We need to measure subjective valuations as well
Evaluation framework: With the framework you can put this all together
Measurement: Measure subjective valuations with questionnaires and factor analysis
Analysis: Use structural equation models to test comprehensive causal models
Some examples: I demonstrated the findings of some studies that used this approach
See you in Dublin!
www.usabart.nl
@usabart
Resources: User-centric evaluation
Knijnenburg, B.P., Willemsen, M.C., Gantner, Z., Soncu, H., Newell, C.: Explaining the User Experience of Recommender Systems. UMUAI 2012.
Knijnenburg, B.P., Willemsen, M.C., Kobsa, A.: A Pragmatic Procedure to Support the User-Centric Evaluation of Recommender Systems. RecSys 2011.
Choice overload
Willemsen, M.C., Graus, M.P., Knijnenburg, B.P., Bollen, D.: Not just more of the same: Preventing Choice Overload in Recommender Systems by Offering Small Diversified Sets. Submitted to TiiS.
Bollen, D.G.F.M., Knijnenburg, B.P., Willemsen, M.C., Graus, M.P.: Understanding Choice Overload in Recommender Systems. RecSys 2010.
Resources: Preference elicitation methods
Knijnenburg, B.P., Reijmer, N.J.M., Willemsen, M.C.: Each to His Own: How Different Users Call for Different Interaction Methods in Recommender Systems. RecSys 2011.
Knijnenburg, B.P., Willemsen, M.C.: The Effect of Preference Elicitation Methods on the User Experience of a Recommender System. CHI 2010.
Knijnenburg, B.P., Willemsen, M.C.: Understanding the effect of adaptive preference elicitation methods on user satisfaction of a recommender system. RecSys 2009.
Social recommenders
Knijnenburg, B.P., Bostandjiev, S., O'Donovan, J., Kobsa, A.: Inspectability and Control in Social Recommender Systems. RecSys 2012.
Resources: User feedback and privacy
Knijnenburg, B.P., Kobsa, A: Making Decisions about Privacy: Information Disclosure in Context-Aware Recommender Systems. Submitted to TiiS.
Knijnenburg, B.P., Willemsen, M.C., Hirtbach, S.: Getting Recommendations and Providing Feedback: The User-Experience of a Recommender System. EC-Web 2010.