1. Recommender as an example Steven Chiu RD department Vpon
Inc.
2. Outline Background, challenges and KPIs Basic concept
Challenges and KPIs Vpon Ad service infrastructure AD effectiveness
related work Recommender System flows Summary Q&A
3. Basic concept Vpon Ad service infrastructure Challenges and
KPIs
4. Typical use case Clicks Conversions The media Landing pages
ADs
5. Ads on Vpon Mainly for Navigation apps, e.g. Navidog POI
(Map) POI (Banner) Normal
6. Full screen ads Video ads Ads on Vpon
7. AD Performance Evaluation Click Through Rate (CTR)
Conversion Rate Goals To maximize CTR To maximize conversions
Click Conversion Impression
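The two KPIs above can be written as simple ratios. The sketch below assumes the common definitions (CTR as clicks per impression, conversion rate as conversions per click); the slide does not spell them out, and the function names are illustrative.

```python
def click_through_rate(clicks: int, impressions: int) -> float:
    """CTR = clicks / impressions (assumed definition); 0.0 if nothing was shown."""
    return clicks / impressions if impressions else 0.0


def conversion_rate(conversions: int, clicks: int) -> float:
    """Conversion rate = conversions / clicks (assumed definition); 0.0 if no clicks."""
    return conversions / clicks if clicks else 0.0


# Made-up numbers for illustration: 10,000 impressions, 50 clicks, 5 conversions.
ctr = click_through_rate(50, 10_000)
cvr = conversion_rate(5, 50)
```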
8. Integration Apps Placing Ads Charged in CPC, CPM Criteria:
time, locations, app categories, budget, Performance reports
Advertisers app App reports app app Mobile app users Mobile app
publishers Advertisers Ad performance reports
9. Vpon AD services backend Data Archiving & Analysis User
Context Runtime information Users Ad Requests Ad Serving Scalable
AD Serving Transaction & Billing Real-time Ad Selection
UserScenario Modeling Data Mining MR/Spark HBase HDFS Ad-hoc
Analytics Reporting & Data Warehouse Adaptive AD Distribution
System Continuous Improvement Ad performance
10. Some numbers for Vpon AD Network: 60+ M monthly active unique
devices; 200+ M daily ad requests; 2+ T ad transaction records over
time; 25+ M cell tower/Wi-Fi AP location data. Taipei, Shanghai, HK,
Beijing and Tokyo. 2 IDCs at Taipei and Shanghai, and some Amazon EC2
nodes.
11. Vpon AD services backend functions
[Architecture diagram. Labels: Ad Requests (HTTP POST); HA Proxy; Ad
web service; Memory cache; In-memory Grid; Message Routing (Apache
Kafka); Ad Request Queue; Backend; HBase; MapReduce/Spark; Hadoop
Distributed File System (HDFS); User Profiles (Couch DB and HBase);
Data Analysis; Data Processing and Archiving; Python + pig, hive,
Hadoop Streaming, spark; Avro; Rsync; Web Proxy + Cache; CDN; Ad
videos, images (HTTP GET); Creative and videos; AD management; Report
UI (Django, SSH); Recommender System; Scenario modeling; Reporting
system; Sales Support System; AD-hoc reporting; Operation; Ganglia;
Solr; AD Operation; AD Monitoring System; Other undergoing topics;
Advertisers]
12. Recommender as an example Design and Implementation
13. Recommender Types User(imei) based recommender system
Item(ad) based recommender system Steps Step1: Campaign/AD
similarity table Step2: Prediction Phase Step3: Verification Phase
(Continuous Improvement)
14. Serve ads according to users' preferences Recommender flow
Prediction Machine Learning (e.g. recommender) Evaluation Data
Selection Select user records of the Ad Click/Conversion action by
different kinds of Apps Select users logs of the Location,
Date/Time, Usage Freq., Area, Movement Speed Identify relation of
the conversion types, App info, Ad info and user info to best
choose configurations Campaign/AD similarity calculation User
preferences Advertising in accordance with the identified targeted
users Feed the AD execution results back into the system to adjust
the modeling adaptively
15. Step1: Ads' Similarities
Rows: unique device IDs from latest K months. Columns: historical and
ongoing ads (App downloads as conversions).

          Ad 1  Ad 2  Ad 3  Ad 4  Ad N
User 1    0     0     1     0     0
User 2    1     1     0     1     0
User 3    1     1     1     1     1
User 4    1     1     0     0     0
User N
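A minimal sketch of how an ad-to-ad similarity table could be derived from such a user-by-ad conversion matrix. The slides do not name the similarity measure, so cosine similarity is an assumption here, and `ad_similarities` is a hypothetical helper. The matrix values are the ones shown on the slide.

```python
from itertools import combinations
from math import sqrt

# Rows are users (device IDs), columns are ads; 1 means the user converted.
conversions = {
    "user1": [0, 0, 1, 0, 0],
    "user2": [1, 1, 0, 1, 0],
    "user3": [1, 1, 1, 1, 1],
    "user4": [1, 1, 0, 0, 0],
}


def ad_similarities(matrix):
    """Return {(ad_i, ad_j): cosine similarity} for every pair of ad columns."""
    rows = list(matrix.values())
    n_ads = len(rows[0])
    columns = [[row[j] for row in rows] for j in range(n_ads)]
    sims = {}
    for a, b in combinations(range(n_ads), 2):
        dot = sum(x * y for x, y in zip(columns[a], columns[b]))
        sq_a = sum(x * x for x in columns[a])
        sq_b = sum(y * y for y in columns[b])
        norm = sqrt(sq_a * sq_b)
        sims[(a, b)] = dot / norm if norm else 0.0
    return sims


sims = ad_similarities(conversions)
```

With this data, the first two ads were converted on by exactly the same users, so their similarity is 1.0, while ads that share fewer users score lower.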
16. Step2: Users' Preferences
Rows: unique device IDs from latest K months. Columns: historical and
ongoing ads (App downloads as conversions).

          Ad 1    Ad 2    Ad 3    Ad 4    Ad N
User 1    P(1,1)  P(1,2)  P(1,3)  P(1,4)  P(1,5)
User 2    P(2,1)  P(2,2)  P(2,3)  P(2,4)  P(2,5)
User 3    P(3,1)  P(3,2)  P(3,3)  P(3,4)  P(3,5)
User 4    P(4,1)  P(4,2)  P(4,3)  P(4,4)  P(4,5)
User Z
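One way the preference values P(user, ad) might be filled in is the usual item-based collaborative filtering score: a similarity-weighted average of the user's past conversions. The slides do not give the formula, so this is an assumption; `preference_scores` and the similarity values below are hypothetical.

```python
def preference_scores(user_row, sim):
    """Score every ad for one user.

    user_row[b] is 1 if the user converted on ad b, else 0.
    sim[a][b] is the similarity between ads a and b.
    """
    n = len(user_row)
    scores = []
    for a in range(n):
        others = [b for b in range(n) if b != a]
        weighted = sum(sim[a][b] * user_row[b] for b in others)
        total = sum(abs(sim[a][b]) for b in others)
        scores.append(weighted / total if total else 0.0)
    return scores


# Hypothetical symmetric similarity table for three ads.
sim = [
    [1.0, 0.8, 0.1],
    [0.8, 1.0, 0.3],
    [0.1, 0.3, 1.0],
]
user_row = [1, 0, 0]  # this user converted on the first ad only
scores = preference_scores(user_row, sim)
```

In this example the second ad gets the highest score because it is the one most similar to the ad the user already converted on.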
17. Step3: Prediction Phase: ADs sorted by preference (shown for
User 1 and User 2)
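Sorting ads by preference can be sketched as below. The scores, the `rank_ads` helper and the `top_n` cutoff are illustrative assumptions, not values from the deck.

```python
def rank_ads(preferences, top_n=3):
    """Return the top_n ad IDs, highest preference first (ties broken by ad ID)."""
    ranked = sorted(preferences.items(), key=lambda kv: (-kv[1], kv[0]))
    return [ad for ad, _score in ranked[:top_n]]


# Made-up preference scores for one user.
prefs = {"ad1": 0.9, "ad2": 0.7, "ad3": 0.1, "ad4": 0.2, "ad5": 0.6}
top = rank_ads(prefs)
```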
18. [Architecture diagram. Labels: Ad Requests (HTTP POST); HA Proxy;
Ad web service; Memory cache; In-memory Grid; Message Routing (Apache
Kafka); Ad Request Queue; Backend; HBase; MapReduce/Spark; Hadoop
Distributed File System (HDFS); User Profiles (Couch DB and HBase);
Data Analysis; Data Processing and Archiving; Avro; Rsync; Billing
System; Billing Proxy + Cache; CDN; Ad videos, images (HTTP GET);
Creative and videos; Recommender System; Scenario modeling; Reporting
system; Sales Support System; AD-hoc reporting; Operation; Ganglia;
Solr; AD Operation; AD Monitoring System; Other undergoing topics]
Step3: Prediction Phase: Serving Ads based on Preferences
  user1: ad1, ad2, ad5
  user2: ad2, ad4, ad5
  user3: ad4, ad5, ad6, ad8
Persisted on Apache CouchDB. Replicated to in-memory grid.
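At serving time, the per-user ad lists can be treated as a key-value lookup. The sketch below uses a plain dict as a stand-in for the in-memory grid; the `AdSelector` class and the fallback list for unknown users are assumptions, not part of the deck.

```python
class AdSelector:
    """Looks up precomputed ad lists by user ID."""

    def __init__(self, recommendations, default_ads):
        # In production this would be the replicated in-memory grid;
        # a dict stands in for it here.
        self._recommendations = dict(recommendations)
        self._default_ads = list(default_ads)

    def ads_for(self, user_id):
        """Return the user's recommended ads, or the default list if unknown."""
        return self._recommendations.get(user_id, self._default_ads)


selector = AdSelector(
    {
        "user1": ["ad1", "ad2", "ad5"],
        "user2": ["ad2", "ad4", "ad5"],
        "user3": ["ad4", "ad5", "ad6", "ad8"],
    },
    default_ads=["ad1"],  # hypothetical fallback for users without a profile
)
```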
21. Implementation Hadoop MapReduce as computing platform Using
Hadoop streaming with Python Map: a list of ad pairs as input for
similarity calculation Reduce: simply aggregate the map results
Re-modeling on a daily basis based on results Will go on to use
Hadoop HDFS + Spark + Python for performance benefit
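A rough sketch of what a Hadoop Streaming job in Python could look like for the map and reduce steps above. The input layout (one ad pair per line, with each ad's converting device IDs as a comma-separated field), the cosine measure and the output format are all assumptions made for illustration. It relies on Hadoop Streaming sorting mapper output by key before the reducer runs.

```python
import sys
from itertools import groupby
from math import sqrt


def mapper(lines):
    """For each 'ad_a<TAB>ad_b<TAB>users_a<TAB>users_b' line, emit the pair's similarity."""
    for line in lines:
        ad_a, ad_b, users_a, users_b = line.rstrip("\n").split("\t")
        set_a, set_b = set(users_a.split(",")), set(users_b.split(","))
        denom = sqrt(len(set_a) * len(set_b))
        sim = len(set_a & set_b) / denom if denom else 0.0
        yield f"{ad_a}\t{ad_b}\t{sim:.4f}"


def reducer(lines):
    """Aggregate sorted mapper output into one line of neighbours per ad."""
    rows = (line.rstrip("\n").split("\t") for line in lines)
    for ad_a, group in groupby(rows, key=lambda row: row[0]):
        neighbours = ",".join(f"{ad_b}:{sim}" for _ad, ad_b, sim in group)
        yield f"{ad_a}\t{neighbours}"


if __name__ == "__main__" and len(sys.argv) > 1 and sys.argv[1] in ("map", "reduce"):
    # Run as 'script.py map' or 'script.py reduce' with data on stdin.
    step = mapper if sys.argv[1] == "map" else reducer
    for out in step(sys.stdin):
        print(out)
```

Keeping the mapper and reducer as plain generator functions makes them easy to test locally on a few lines before submitting the job to the cluster.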
22. Summary Build the infra. that proves models effective or
not as early as possible AB testing for new models Automate as much
as possible Monitoring and measurement Computing resource Properly
manage Product, ad-hoc, analysis jobs Optimization does work Use
Python wherever it fits