DESCRIPTION
The aim of the Experience Discovery project is to recommend extracurricular activities to high school and middle school students in urban areas. In implementing this system, we have been able to make use of both usage data and data drawn from a social networking site. Using pilot data, we are able to show that very simple aggregation techniques applied to the social network can improve recommendation accuracy.
Experience Discovery: Hybrid Recommendation of Student Activities using Social Network Data
Robin Burke, Yong Zheng, Scott Riley
Web Intelligence Laboratory
College of Computing and Digital Media
DePaul University
Problem
Service organizations offer many educational programs and activities for youth
Participation (especially by underprivileged youth) is low
Even though these are the individuals who would benefit the most
How to get better participation?
not just a recommendation problem
The Role of Recommendation
Need for personalization
Many diverse activities
from basketball to poetry to robots to knitting
Low tolerance for imprecise results
Need for system initiative
user research shows that students are unlikely to search and browse to “pull” opportunities
system should “push” suggestions
we are considering mobile platforms
Partners
Digital Youth Network
service organization focused on the creation of digital media
Nichole Pinkard
YouMedia
school-based online social network
affiliated with DYN
Chicago Learning Network
consortium of museums and non-profits
Chicago Public Schools
Funders
MacArthur Foundation
Gates Foundation
Experience Discovery: Research opportunities
Full cycle observation
activity enrollments
activity attendance
click-through
post-activity rating, tagging, reviewing
Social network data
uploading of digital media
browsing / commenting behavior
friend / follower connections
Research question 1
There are multiple important knowledge sources
past enrollment history
content data
social network data
log data
Mixed vs integrated hybrid recommendation
should different knowledge sources be integrated in making recommendations?
or should recommendations of different types be presented side-by-side?
Research question 2
Activities sometimes have a logical planned sequence
Video editing I -> Video editing II
Sometimes they are sequenced idiosyncratically
Digital photography -> Zoo explorer I
Educational goal
increase both depth and breadth of student participation
The role of “curricula”
how can recommendations be used to increase both breadth and depth of student involvement?
what is the role of top-down vs bottom-up sequences in recommendation?
Research question 3
Dynamics of interest
students mature a lot between 11 and 18
old activities may lose their appeal
Dynamics of offerings
activities change from year to year and season to season
may not be explicit
Coping with change
how can we ensure that recommendations don’t lag student interest?
how to detect and respond to program changes?
Research question 4
Students aren’t the only ones with questions
Service providers can get value, too
what activities should I offer and where?
how do my offerings compare to other groups?
what needs are not being met?
Analytics and recommendations for service providers
what can we provide that is helpful and comprehensible?
Architecture
[Diagram: Experience Recommendation Platform. Labeled components: Client Application, Operational Interface, Evaluation Interface, Recommendation Engine, Algorithm Library, Input Cache, Result Cache, Experimental Configurations, Experimental Results. Labeled data sources: Activity Data, Social Network Data, Attendance Data.]
Initial experiments
Data (2 schools)
226 students
32 activities
3800 records
(now adding ~2000 enrollments and ~50 activities / month)
Algorithms
collaborative / binary
collaborative / pseudo rating
content-collaborative meta-level hybrid
plus behavioral descriptors
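The "collaborative / binary" variant can be illustrated with a small neighbourhood recommender over attended / not-attended data. This is a minimal sketch under assumptions, not the project's implementation: the slides do not name a similarity measure or neighbourhood size, so Jaccard similarity, `k`, `n`, and all user and activity names here are hypothetical choices.

```python
from collections import defaultdict

def jaccard(a, b):
    """Jaccard similarity between two sets of activity ids."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def recommend_binary(target, attended, k=2, n=3):
    """Rank unseen activities by the summed similarity of the k nearest users.

    attended maps each user id to the set of activities that user enrolled in
    (binary data: enrolled or not). Jaccard and k are assumptions.
    """
    mine = attended[target]
    neighbours = sorted(
        ((jaccard(mine, acts), user)
         for user, acts in attended.items() if user != target),
        reverse=True,
    )[:k]
    scores = defaultdict(float)
    for sim, user in neighbours:
        if sim <= 0:
            continue  # ignore users with no overlap at all
        for activity in attended[user] - mine:
            scores[activity] += sim
    # Highest score first; ties broken alphabetically for determinism.
    return sorted(scores, key=lambda a: (-scores[a], a))[:n]

# Hypothetical toy data.
attended = {
    "ana": {"poetry", "robots"},
    "ben": {"poetry", "robots", "knitting"},
    "cai": {"basketball"},
}
print(recommend_binary("ana", attended))
```

With only 226 students and 32 activities in the pilot data, a small neighbourhood like this is cheap to compute exhaustively.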
Pseudo-ratings
Some activities are attended multiple times
evidence of strong interest
Example
book discussion group
Normalize to user’s profile
weight for activity a = # of times attending a / total attendances
Can we normalize in other ways?
take into account how often something was offered
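The pseudo-rating described above (attendances of an activity divided by the user's total attendances) can be sketched as follows. The function names and the sample log are hypothetical. The second function is only one possible reading of "take into account how often something was offered", which the slides leave as an open question.

```python
from collections import Counter

def pseudo_ratings(attendance_log):
    """Weight for activity a = times attending a / total attendances.

    attendance_log holds one activity id per attendance for a single user.
    """
    counts = Counter(attendance_log)
    total = sum(counts.values())
    return {activity: n / total for activity, n in counts.items()}

def offer_normalized(attendance_log, times_offered):
    """Assumed alternative: times attended / times the activity was offered."""
    counts = Counter(attendance_log)
    return {activity: n / times_offered[activity] for activity, n in counts.items()}

# Hypothetical user who attended a book discussion group three times.
log = ["book discussion"] * 3 + ["video editing"]
print(pseudo_ratings(log))
print(offer_normalized(log, {"book discussion": 6, "video editing": 1}))
```

The two normalizations can disagree: a weekly group attended three times out of six offerings looks less committed under the second scheme than a one-off workshop attended once.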
Meta-level hybrid
Use course topic descriptors
13 choices
health, music, visual arts, etc.
activities may include several topics
Build a topic profile by summing over descriptions of all activities
Compare users based on topic profiles
rather than attendance data
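The topic-profile step can be sketched as below: sum the topic descriptors of every activity a user attended, then compare users on those profiles instead of on raw attendance. Cosine similarity is an assumption (the slides do not say which measure was used), and the activity names and their topic assignments are hypothetical.

```python
import math
from collections import Counter

def topic_profile(attended, activity_topics):
    """Sum topic descriptors over every activity the user attended."""
    profile = Counter()
    for activity in attended:
        profile.update(activity_topics[activity])
    return profile

def cosine(p, q):
    """Cosine similarity between two sparse topic-count profiles."""
    dot = sum(p[t] * q[t] for t in p.keys() & q.keys())
    norm = (math.sqrt(sum(v * v for v in p.values()))
            * math.sqrt(sum(v * v for v in q.values())))
    return dot / norm if norm else 0.0

# Hypothetical activities; an activity may carry several topics.
activity_topics = {
    "mural club": ["visual arts"],
    "dance fitness": ["health", "music"],
    "choir": ["music"],
}
ana = topic_profile(["mural club", "dance fitness"], activity_topics)
ben = topic_profile(["choir"], activity_topics)
print(ana, ben, cosine(ana, ben))
```

Here the two users share no activity, so attendance-based similarity would be zero, yet their topic profiles overlap on music. That is the benefit of comparing on topic profiles in sparse data.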
Adding social network data
Extracted 10 features from the social network
counts of uploaded media types
overall level of activity
Used feature combination
content profile
behavior profile
[Diagram labels: Activity Data, Descriptors, Descriptor-based profile; Social Network, Behavior data, Behavior-based profile]
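Feature combination, as used here, joins the content (topic) profile and the behavior profile into one feature vector per user before computing user similarity. The sketch below shows one way to do that. The feature names, the per-block unit-length scaling, and the cosine measure are all assumptions, since the slides do not list the 10 extracted features or say how they were scaled.

```python
import math

# Three of the 13 topic descriptors named in the slides.
TOPICS = ["health", "music", "visual arts"]
# Hypothetical behavior features; the slides mention only counts of
# uploaded media types and overall level of activity.
BEHAVIOURS = ["uploads_video", "uploads_image", "activity_level"]

def _unit(values):
    """Scale a vector to unit length so neither block dominates."""
    norm = math.sqrt(sum(v * v for v in values))
    return [v / norm if norm else 0.0 for v in values]

def combined_profile(topic_counts, behaviour_counts):
    """Concatenate the content profile and the behavior profile."""
    content = _unit([topic_counts.get(t, 0) for t in TOPICS])
    behaviour = _unit([behaviour_counts.get(b, 0) for b in BEHAVIOURS])
    return content + behaviour

def cosine(u, v):
    """Cosine similarity between two equal-length dense vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

# Hypothetical users.
ana = combined_profile({"music": 2, "visual arts": 1}, {"uploads_video": 4})
ben = combined_profile({"music": 1}, {"uploads_video": 1, "uploads_image": 1})
print(cosine(ana, ben))
```

Scaling each block separately is a design choice: raw upload counts could otherwise swamp the topic counts simply because they are larger numbers.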
Results
Temporal leave-one-out evaluation
see Burke, 2010
Look at a user’s experience over time
looking at users divided by
# of enrollments (profile size)
profile diversity (# of different enrollments)
Need to do more research
Hybrid 2 works best for large, diverse users
Doesn’t matter what you do for non-diverse users
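The exact temporal leave-one-out protocol is defined in Burke, 2010, not in these slides. The sketch below shows only one plausible reading: each enrollment is predicted from the data that preceded it in time, and a hit is counted when the enrolled activity appears in the top-n list. The hit-rate metric, the `most_popular` baseline, and the sample events are all hypothetical.

```python
from collections import Counter

def temporal_leave_one_out(events, recommend, n=3):
    """Hit rate when each enrollment is predicted from earlier data only.

    events: list of (timestamp, user, activity) tuples.
    recommend: callable (history, user, n) -> list of activity ids.
    """
    events = sorted(events)
    hits = trials = 0
    for i, (_, user, activity) in enumerate(events):
        history = events[:i]  # everything that happened before this event
        if not any(u == user for _, u, _ in history):
            continue  # user has no profile yet, nothing to predict from
        trials += 1
        if activity in recommend(history, user, n):
            hits += 1
    return hits / trials if trials else 0.0

def most_popular(history, user, n):
    """Baseline: most-enrolled activities the user has not yet taken."""
    seen = {a for _, u, a in history if u == user}
    counts = Counter(a for _, _, a in history if a not in seen)
    return [a for a, _ in counts.most_common(n)]

# Hypothetical enrollment log.
events = [
    (1, "ana", "poetry"),
    (2, "ben", "poetry"),
    (3, "ben", "robots"),
    (4, "ana", "robots"),
]
print(temporal_leave_one_out(events, most_popular))
```

Recording the profile size and profile diversity at each trial would allow the per-group breakdown described on this slide.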
Conclusions
We are in the early stages here
Eager to get our hands on bigger data
Many research questions
Would like to hear ideas
Thanks
Questions / Comments / Ideas