It's all About the Data

It's all about the Data...

Data-driven Approaches to the Recommendation Problem

Xavier Amatriain

Telefonica Research

But first...

About Telefonica and Telefonica R&D

Telefonica is a fast-growing Telecom

1989
Staff: about 71,000 professionals
Clients: about 12 million subscribers
Services: basic telephone and data services
Geographies: Spain
Finances: Rev: 4,273 M€; EPS(1): 0.45

2000
Staff: about 149,000 professionals
Clients: about 68 million customers
Services: wireline and mobile voice, data and Internet services
Geographies: operations in 16 countries
Finances: Rev: 28,485 M€; EPS(1): 0.67

2008
Staff: about 257,000 professionals
Clients: about 260 million customers
Services: integrated ICT solutions for all customers
Geographies: operations in 25 countries
Finances: Rev: 57,946 M€; EPS(1): 1.63

(1) EPS: Earnings per share

Currently among the largest in the world

Telco sector worldwide ranking by market cap (US$ bn)

Source: Bloomberg, 06/12/09

Leader in South America

Total Accesses (as of March 09): 159.5 million

Argentina: 20.9 million
Brazil: 61.4 million
Central America: 6.1 million
Colombia: 12.6 million
Chile: 10.1 million
Ecuador: 3.3 million
Mexico: 15.7 million
Peru: 15.2 million
Uruguay: 1.5 million
Venezuela: 12.0 million

[Figure: wireline and mobile market rank per country]

Notes:
- Central America includes Guatemala, Panama, El Salvador and Nicaragua
- Total accesses figure includes Narrowband Internet accesses of Terra Brasil and Terra Colombia, and Broadband Internet accesses of Terra Brasil, Telefónica de Argentina, Terra Guatemala and Terra México.

Data as of March 09

In recent years, Telefónica has consolidated its leadership in Latin America,

And a significant footprint in Europe

Total Accesses (as of March 09): 93.8 million

Spain: 47.2 million
UK: 20.8 million
Germany: 16.0 million
Ireland: 1.7 million
Czech Republic: 7.7 million
Slovakia: 0.4 million

[Figure: wireline and mobile market rank per country]

Data as of March 09

and has achieved significant scale in Europe

Telefonica R&D (TID) is the Research and Development Unit of the Telefónica Group

MISSION: To contribute to the improvement of the Telefónica Group's competitiveness through technological innovation

Founded in 1988

Largest private R&D center in Spain

More than 1100 professionals

Five centers in Spain and two in Latin America

In 2008, Telefónica ranked first among Spanish companies by R&D investment and third in the EU

Technological Innovation (1): 4,384 M€ (products / services / processes development)

R&D: 594 M€

Applied research (R&D): 61 M€

TID Scientific Groups: Publications, Patents, TechTransfer

TI+D Scientific Groups

Pablo Rodriguez
Internet Scientific Director

Nuria Oliver
Multimedia Scientific Director
Data Mining and User Modeling
Acting Scientific Director

Internet Scientific Areas

Content Distribution and P2P

Next generation Managed P2P-TV

Future Internet: Content Networking

Delay Tolerant Bulk Distribution

Network Transparency

Social Networks

Information Propagation

Social Search Engines

Infrastructure for Social based cloud computing

Wireless and Mobile Systems

Wireless bundling

Device2Device Content Distribution

Large Scale mobile data analysis

Multimedia Scientific Areas

Multimedia Core

Multimedia Data Analysis, Search & Retrieval

Video, Audio, Image, Music, Text, Sensor Data

Understanding, Summarization, Visualization

Mobile and Ubicomp

Context Awareness

Urban Computing

Mobile Multimedia & Search

Wearable Physiological Monitoring

HCC

Multimodal User Interfaces

Expression, Gesture, Emotion Recognition

Personalization & Recommendation Systems

Super Telepresence

Data Mining & User Modeling Areas

DATA MINING

Integration of statistical & knowledge-based techniques

Stream mining

Large scale & distributed machine learning

USER MODELING

Application to new services (technology for development)

Cognitive, socio-cultural, and contextual modeling

Behavioral user modeling (service-use patterns)

SOCIAL NETWORK ANALYSIS & BUSINESS INT.

Analytical CRM

Trend-spotting, service propagation & churn

Social Graph Analysis (construction, dynamics)

I like it... I like it not

Evaluating User Ratings Noise in Recommender Systems

Xavier Amatriain (@xamat), Josep M. Pujol, Nuria Oliver

Telefonica Research

Recommender Systems are everywhere

Netflix: 2/3 of the movies rented were recommended

Google News: recommendations generate 38% more clickthrough

Amazon: 35% sales from recommendations

"We are leaving the age of information and entering the age of recommendation" - The Long Tail (Chris Anderson)

The Netflix Prize

500K users x 17K movie titles = 100M ratings = $1M (if you "only" improve the existing system by 10%, from 0.95 to 0.85 RMSE)

This is what Netflix thinks a 10% improvement is worth for their business

49K contestants on 40K teams from 184 countries.

41K valid submissions from 5K teams; 64 submissions in the last 24 hours

But, is there a limit to RS accuracy?

Evolution of accuracy in Netflix Prize

The Magic Barrier

Magic Barrier = Limit on prediction accuracy due to noise in original data

Natural Noise = involuntary noise introduced by users when giving feedback

Due to (a) mistakes, and (b) lack of resolution in the personal rating scale (e.g. on a 1 to 5 scale, a 2 may mean the same as a 3 for some users and some items).

Magic Barrier >= Natural Noise Threshold

We cannot predict with less error than the resolution in the original data
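The barrier can be illustrated with a small simulation. This is only a sketch under stated assumptions: each observed rating is modeled as a user's underlying opinion plus zero-mean Gaussian noise, and the noise level `NOISE_STD` and all variable names are hypothetical, not taken from the study. Even an "oracle" that knows every underlying opinion exactly cannot push its RMSE against the observed ratings much below the noise level.

```python
import math
import random


def rmse(predicted, observed):
    """Root mean squared error between two equally long sequences."""
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


rng = random.Random(42)  # fixed seed so the sketch is reproducible
NOISE_STD = 0.5          # assumed natural-noise level (illustrative only)

# Hypothetical underlying opinions on a continuous 1-5 scale.
true_opinion = [rng.uniform(1, 5) for _ in range(20000)]

# Observed ratings = underlying opinion + natural noise (assumption).
observed = [t + rng.gauss(0, NOISE_STD) for t in true_opinion]

# The oracle predicts the underlying opinion exactly, yet its error
# against the noisy observations stays close to NOISE_STD.
oracle_rmse = rmse(true_opinion, observed)
print(f"oracle RMSE: {oracle_rmse:.3f} (noise std: {NOISE_STD})")
```

Under this simplified model the noise standard deviation plays the role of the natural noise threshold: no predictor evaluated on the noisy ratings can be expected to do better than the oracle.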

The Question in the Wind

Our related research questions

Q1. Are users inconsistent when providing explicit feedback to Recommender Systems via the common Rating procedure?

Q2. How large is the prediction error due to these inconsistencies?

Q3. What factors affect user inconsistencies?

Experimental Setup (I)

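Test-retest procedure: you need at least 3 trials to separate reliability from stability

Reliability: how much you can trust the instrument you are using (i.e. ratings)

$r = r_{12} r_{23} / r_{13}$, where $r_{ij}$ is the correlation between the ratings given in trials $i$ and $j$

Stability: drift in user opinion

$s_{12} = r_{13} / r_{23}$; $s_{23} = r_{13} / r_{12}$; $s_{13} = r_{13}^2 / (r_{12} r_{23})$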

Users rated movies in 3 trials

Trial 1 -> (24 h) -> Trial 2 -> (15 days) -> Trial 3

Experimental Setup (II)

100 movies selected from the Netflix dataset by stratified random sampling on popularity

Ratings on a 1 to 5 star scale

Special "not seen" symbol.

Trial 1 and 3 = random order; trial 2 = ordered by popularity

118 participants
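One way to draw a stratified random sample on popularity is sketched below. The slides do not say how many strata were used or how movies were allocated to them, so the equal-sized strata, the equal allocation per stratum, the function name, and the toy popularity counts are all assumptions made for illustration.

```python
import random


def stratified_sample(popularity, n_total, n_strata, seed=0):
    """Sample n_total items, drawing equally from popularity strata.

    popularity: dict mapping movie id -> number of ratings.
    Assumes n_total is divisible by n_strata and that every stratum
    holds at least n_total // n_strata movies.
    """
    rng = random.Random(seed)
    # Rank movies from most to least popular.
    ranked = sorted(popularity, key=popularity.get, reverse=True)
    per_stratum = n_total // n_strata
    stratum_size = len(ranked) // n_strata
    sample = []
    for k in range(n_strata):
        start = k * stratum_size
        # The last stratum absorbs any leftover movies.
        end = (k + 1) * stratum_size if k < n_strata - 1 else len(ranked)
        sample.extend(rng.sample(ranked[start:end], per_stratum))
    return sample


# Toy catalogue (hypothetical): 1000 movies, movie 0 is the most popular.
popularity = {movie_id: 1000 - movie_id for movie_id in range(1000)}

picked = stratified_sample(popularity, n_total=100, n_strata=10)
print(len(picked), "movies selected")
```

Sampling within popularity bands keeps both blockbusters and obscure titles in the selection, which a plain uniform sample over a long-tailed catalogue would not guarantee.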

Results

Comparison to Netflix Data

Distribution of number of ratings per movie very similar to Netflix but average rating is lower (users are not voluntarily choosing what to rate)

Test-retest Stability and Reliability

Overall reliability = 0.924 (good reliabilities are expected to be > 0.9)

Removing mild ratings yields higher reliabilities, while removing extreme ratings yields lower ones

Stabilities: s12 = 0.973, s23 = 0.977, and s13 = 0.951

Stabilities might also be accounting for learning effect (note s12
