It's all about the Data...
Data-driven Approaches to the
Recommendation Problem
Xavier Amatriain
Telefonica Research
But first...
About Telefonica and Telefonica R&D
1989: Spain; about 71,000 professionals; about 12 million subscribers; basic telephone and data services; Rev: 4,273 M€; EPS(1): 0.45
2000: Operations in 16 countries; about 149,000 professionals; about 68 million customers; wireline and mobile voice, data and Internet services; Rev: 28,485 M€; EPS(1): 0.67
2008: Operations in 25 countries; about 257,000 professionals; about 260 million customers; integrated ICT solutions for all customers; Rev: 57,946 M€; EPS: 1.63
(1) EPS: Earnings per share
Telefonica is a fast-growing Telecom
Currently among the largest in the world
[Chart: Telco sector worldwide ranking by market cap (US$ bn). Source: Bloomberg, 06/12/09]
Leader in South America
Total accesses (as of March 09): 159.5 million
Argentina: 20.9 million
Brazil: 61.4 million
Central America: 6.1 million
Colombia: 12.6 million
Chile: 10.1 million
Ecuador: 3.3 million
Mexico: 15.7 million
Peru: 15.2 million
Uruguay: 1.5 million
Venezuela: 12.0 million
[Map: wireline and mobile market rank per country]
Notes:
- Central America includes Guatemala, Panama, El Salvador and Nicaragua
- The total accesses figure includes narrowband Internet accesses of Terra Brasil and Terra Colombia, and broadband Internet accesses of Terra Brasil, Telefónica de Argentina, Terra Guatemala and Terra México.
In recent years, Telefónica has consolidated its leadership in Latin America,
And a significant footprint in Europe
Total accesses (as of March 09): 93.8 million
Spain: 47.2 million
UK: 20.8 million
Germany: 16.0 million
Ireland: 1.7 million
Czech Republic: 7.7 million
Slovakia: 0.4 million
[Map: wireline and mobile market rank per country]
and has achieved significant scale in Europe
Telefonica R&D (TID) is the Research and Development Unit of the Telefónica Group
MISSION: To contribute to the improvement of the Telefónica Group's competitiveness through technological innovation
Founded in 1988
Largest private R&D center in Spain
More than 1100 professionals
Five centers in Spain and two in Latin America
In 2008, Telefónica was the top Spanish company by R&D investment and the third in the EU
Products / Services / Processes development: Technological Innovation (1), R&D (594 M€; 4.384 M€)
Applied research: R&D (61 M€)
TID Scientific Groups: Publications, Patents, TechTransfer
TI+D Scientific Groups
Pablo Rodriguez
Internet Scientific Director
Nuria Oliver
Multimedia Scientific Director
Data Mining and User Modeling
Acting Scientific Director
Internet Scientific Areas
Content Distribution and P2P
Next generation Managed P2P-TV
Future Internet: Content Networking
Delay Tolerant Bulk Distribution
Network Transparency
Social Networks
Information Propagation
Social Search Engines
Infrastructure for Social based cloud computing
Wireless and Mobile Systems
Wireless bundling
Device2Device Content Distribution
Large Scale mobile data analysis
Multimedia Scientific Areas
Multimedia Core
Multimedia Data Analysis, Search & Retrieval
Video, Audio, Image, Music, Text, Sensor Data
Understanding, Summarization, Visualization
Mobile and Ubicomp
Context Awareness
Urban Computing
Mobile Multimedia & Search
Wearable Physiological Monitoring
HCC
Multimodal User Interfaces
Expression, Gesture, Emotion Recognition
Personalization & Recommendation Systems
Super Telepresence
Data Mining & User Modeling Areas
DATA MINING
Integration of statistical & knowledge-based techniques
Stream mining
Large scale & distributed machine learning
USER MODELING
Application to new services (technology for development)
Cognitive, socio-cultural, and contextual modeling
Behavioral user modeling (service-use patterns)
SOCIAL NETWORK ANALYSIS & BUSINESS INT.
Analytical CRM
Trend-spotting, service propagation & churn
Social Graph Analysis (construction, dynamics)
I like it... I like it not
Evaluating User Ratings Noise in
Recommender Systems
Xavier Amatriain (@xamat), Josep M. Pujol, Nuria Oliver
Telefonica Research
Recommender Systems are everywhere
Netflix: 2/3 of the movies rented were recommended
Google News: recommendations generate 38% more clickthrough
Amazon: 35% sales from recommendations
"We are leaving the age of information and entering the age of recommendation" (Chris Anderson, The Long Tail)
The Netflix Prize
500K users x 17K movie titles = 100M ratings = $1M (if you "only" improve the existing system by 10%: from 0.95 to 0.85 RMSE)
This is what Netflix thinks a 10% improvement is worth for their business
49K contestants on 40K teams from 184 countries.
41K valid submissions from 5K teams; 64 submissions in the last 24 hours
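As a rough illustration of the numbers on this slide, the sketch below computes RMSE on a toy example, the target implied by a 10% relative improvement over 0.95, and the density of the rating matrix implied by the slide's round figures. The function name `rmse` and the toy ratings are assumptions made for this example; they are not taken from the Netflix data.

```python
import math

def rmse(predicted, actual):
    """Root mean squared error between two equal-length rating lists."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical toy ratings (1 to 5 stars) and predictions.
actual = [4, 3, 5, 2, 1]
predicted = [3.5, 3.0, 4.0, 2.5, 2.0]
toy_error = rmse(predicted, actual)

# A 10% relative improvement over a baseline RMSE of 0.95.
target = 0.95 * (1 - 0.10)  # 0.855, which the slide rounds to 0.85

# Fraction of the user x movie matrix that is actually rated,
# using the slide's round numbers (500K users, 17K titles, 100M ratings).
density = 100e6 / (500e3 * 17e3)

print(f"toy RMSE: {toy_error:.4f}")
print(f"target RMSE: {target:.3f}")
print(f"matrix density: {density:.2%}")
```

The density figure shows that only a little over 1% of the possible user and movie pairs carry a rating, which is part of what makes the prediction task hard.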
But, is there a limit to RS accuracy?
Evolution of accuracy in Netflix Prize
The Magic Barrier
Magic Barrier = Limit on prediction accuracy due to noise in original data
Natural Noise = involuntary noise introduced by users when giving feedback
Due to (a) mistakes, and (b) lack of resolution in the personal rating scale (e.g. on a 1 to 5 scale, a 2 may mean the same as a 3 for some users and some items).
Magic Barrier >= Natural Noise Threshold
We cannot predict with less error than the resolution in the original data
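The idea of a noise floor can be illustrated with a small simulation. This is only a sketch under simplifying assumptions that are not in the slides: each user has a hidden "true" preference, the observed rating is that preference plus zero-mean Gaussian noise, and ratings are left unrounded. Under these assumptions, even an oracle that knows every true preference cannot score an RMSE much below the noise standard deviation. All names and parameter values are hypothetical.

```python
import math
import random

def rmse(predicted, actual):
    """Root mean squared error between two equal-length lists."""
    squared = [(p - a) ** 2 for p, a in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))

def simulate_noise_floor(n_ratings=20000, noise_std=0.5, seed=0):
    """RMSE of an oracle predictor against noisy observed ratings.

    Assumption: observed rating = true preference + Gaussian noise.
    The oracle predicts the true preference exactly, so its error is
    entirely due to the noise in the observed ratings.
    """
    rng = random.Random(seed)
    true_prefs = [rng.uniform(1, 5) for _ in range(n_ratings)]
    observed = [t + rng.gauss(0, noise_std) for t in true_prefs]
    return rmse(true_prefs, observed)

# The oracle's RMSE lands close to the assumed noise level.
floor = simulate_noise_floor(noise_std=0.5)
print(f"oracle RMSE with noise_std=0.5: {floor:.3f}")
```

In this toy model the oracle's RMSE is an estimate of the noise standard deviation itself, which is the sense in which noise in the original data limits prediction accuracy.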
The Question in the Wind
Our related research questions
Q1. Are users inconsistent when providing explicit feedback to Recommender Systems via the common Rating procedure?
Q2. How large is the prediction error due to these inconsistencies?
Q3. What factors affect user inconsistencies?
Experimental Setup (I)
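Test-retest procedure: at least 3 trials are needed to separate reliability from stability
Reliability: how much you can trust the instrument you are using (i.e. ratings)
r = (r12 * r23) / r13, where rij is the correlation between the ratings in trials i and j
Stability: drift in user opinion
s12 = r13 / r23; s23 = r13 / r12; s13 = r13^2 / (r12 * r23) = s12 * s23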
Users rated movies in 3 trials
Trial 1 -> (24 h) -> Trial 2 -> (15 days) -> Trial 3
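The reliability and stability formulas of this setup can be sketched in a few lines. The code assumes that r12, r23 and r13 are Pearson correlations between the ratings given in the corresponding pairs of trials; the function names and the example correlation values are hypothetical and do not come from the study.

```python
import math

def pearson(x, y):
    """Pearson correlation between two equal-length rating lists."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two lists of equal length >= 2")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def reliability_and_stability(r12, r23, r13):
    """Apply the three-trial test-retest formulas to the correlations."""
    return {
        "reliability": r12 * r23 / r13,
        "s12": r13 / r23,
        "s23": r13 / r12,
        "s13": r13 ** 2 / (r12 * r23),  # equals s12 * s23
    }

def from_trials(trial1, trial2, trial3):
    """Compute the same quantities from three aligned rating lists."""
    return reliability_and_stability(
        pearson(trial1, trial2),
        pearson(trial2, trial3),
        pearson(trial1, trial3),
    )

# Hypothetical correlations, chosen only to exercise the formulas.
result = reliability_and_stability(r12=0.8, r23=0.8, r13=0.7)
for name, value in result.items():
    print(f"{name}: {value:.4f}")
```

With only two trials there is a single correlation, which cannot tell measurement error apart from a real change of opinion; the third trial supplies the extra correlations that make the two separable.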
Experimental Setup (II)
100 movies selected from the Netflix dataset by stratified random sampling on popularity
Ratings on a 1 to 5 star scale
Special "not seen" symbol.
Trial 1 and 3 = random order; trial 2 = ordered by popularity
118 participants
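One possible way to do stratified random sampling on popularity is sketched below. The slides do not say how the strata were defined, so the choices here are assumptions made for illustration: movies are ranked by rating count, split into equally sized strata, and the same number of movies is drawn from each stratum. The function name, parameters, and toy catalogue are hypothetical.

```python
import random

def stratified_sample_by_popularity(popularity, n_samples, n_strata, seed=0):
    """Draw n_samples items spread evenly across popularity strata.

    popularity: dict mapping item id -> number of ratings (assumed measure
    of popularity). Items are ranked from most to least popular and cut
    into n_strata contiguous groups of (almost) equal size.
    """
    if n_samples % n_strata != 0:
        raise ValueError("n_samples must be a multiple of n_strata")
    per_stratum = n_samples // n_strata
    ranked = sorted(popularity, key=popularity.get, reverse=True)
    size = len(ranked) // n_strata
    if size < per_stratum:
        raise ValueError("not enough items per stratum")

    rng = random.Random(seed)
    sample = []
    for i in range(n_strata):
        start = i * size
        # The last stratum absorbs any remainder.
        end = (i + 1) * size if i < n_strata - 1 else len(ranked)
        sample.extend(rng.sample(ranked[start:end], per_stratum))
    return sample

# Toy catalogue: 1,000 movies with strictly decreasing rating counts.
catalogue = {f"movie_{i}": 1000 - i for i in range(1000)}
picked = stratified_sample_by_popularity(catalogue, n_samples=100, n_strata=10)
print(len(picked), "movies selected")
```

Sampling per stratum guarantees that rarely rated movies appear in the study alongside blockbusters, which plain random sampling over ratings would not.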
Results
Comparison to Netflix Data
Distribution of number of ratings per movie very similar to Netflix but average rating is lower (users are not voluntarily choosing what to rate)
Test-retest Stability and Reliability
Overall reliability = 0.924 (good reliabilities are expected to be > 0.9)
Removing mild ratings yields higher reliabilities, while removing extreme ratings yields lower ones
Stabilities: s12 = 0.973, s23 = 0.977, and s13 = 0.951
Stabilities might also be accounting for learning effect (note s12