1. Inferring Age and Gender of Facebook Users Based on their
Status Updates. Angel Oswaldo Vázquez-Patiño. Master of Science in
Artificial Intelligence. February 2015, Leuven, Belgium.
3. Introduction: Social Media Users. Source: Kemp, Simon, 2014.
Global Social Media Users Pass 2 Billion. We Are Social.
4. Introduction: Facebook Penetration. Source: Kemp, Simon, 2014.
Global Social Media Users Pass 2 Billion. We Are Social.
5. Introduction: Computational social science
6. Introduction: Importance. Social media marketing. Important
attributes: age and gender. Attribute disclosure.
7. Goal of the study. Age and gender inference models. Reduce the
feature dimension. Second Order Representation (SOR).
8. Literature review. Study of Kosinski et al. (2013), relying on
Facebook likes. The Open Vocabulary Approach (Schwartz et al., 2013).
General approach: extraction of features, user representation,
classification model.
11. The Open Vocabulary Approach. Linguistic feature extraction:
n-grams of 1 to 3 words; PMI greater than 2*length; terms used by 1%
of users. Feature dimension reduction: PCA. Representation: BOT
31,169.
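The feature-extraction filters on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the whitespace tokenizer, the function names (`ngrams`, `extract_features`), the parameter names, and the reading of "length" as the number of words in the phrase are all assumptions, and PMI is taken here as the natural log of the phrase probability over the product of its words' probabilities.

```python
import math
from collections import Counter

def ngrams(tokens, n):
    """Return all contiguous n-word tuples in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def extract_features(user_texts, max_n=3, pmi_factor=2.0, min_user_frac=0.01):
    """Select 1- to max_n-grams by PMI and by the fraction of users using them.

    user_texts: dict mapping a user id to a list of status updates (strings).
    All names and defaults here are illustrative assumptions.
    """
    counts = {n: Counter() for n in range(1, max_n + 1)}
    users_with = Counter()  # number of distinct users that used each n-gram
    for updates in user_texts.values():
        seen = set()
        for text in updates:
            tokens = text.lower().split()  # naive tokenizer (assumption)
            for n in range(1, max_n + 1):
                grams = ngrams(tokens, n)
                counts[n].update(grams)
                seen.update(grams)
        users_with.update(seen)

    totals = {n: sum(c.values()) for n, c in counts.items()}
    min_users = min_user_frac * len(user_texts)

    def pmi(gram):
        # log of p(phrase) over the product of its words' unigram probabilities
        p_phrase = counts[len(gram)][gram] / totals[len(gram)]
        p_words = math.prod(counts[1][(w,)] / totals[1] for w in gram)
        return math.log(p_phrase / p_words)

    selected = []
    for n in range(1, max_n + 1):
        for gram in counts[n]:
            if users_with[gram] < min_users:
                continue  # used by too few users
            # single words skip the PMI test; phrases need PMI > factor * length
            if n == 1 or pmi(gram) > pmi_factor * n:
                selected.append(" ".join(gram))
    return sorted(selected)
```

On a toy corpus the PMI threshold is rarely met, so lowering `pmi_factor` is the easiest way to see phrases survive the filter; on a large corpus the slide's value of 2 becomes meaningful.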
12. The Second Order Representation: (1) building term vectors;
(2) building document vectors.
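The two steps can be sketched as follows. The slide does not give the weighting formulas, so this is one plausible formulation under stated assumptions: each term is described by a log-scaled, normalized affinity to every profile (for example "female" and "male"), and each document is the frequency-weighted sum of the vectors of its terms. The function names `build_term_vectors` and `build_document_vector` are hypothetical.

```python
import math
from collections import Counter

def build_term_vectors(docs, labels, profiles):
    """Step 1: describe each term by its affinity to each profile (class).

    docs: list of token lists; labels: the profile of each document.
    The log2(1 + tf/len) weighting is an assumption, not taken from the slide.
    """
    raw = {}
    for tokens, label in zip(docs, labels):
        col = profiles.index(label)
        for term, tf in Counter(tokens).items():
            vec = raw.setdefault(term, [0.0] * len(profiles))
            vec[col] += math.log2(1 + tf / len(tokens))
    # normalize each term vector so its entries sum to 1
    return {t: [v / sum(vec) for v in vec] for t, vec in raw.items()}

def build_document_vector(tokens, term_vectors, n_profiles):
    """Step 2: a document is the frequency-weighted sum of its term vectors."""
    doc_vec = [0.0] * n_profiles
    for term, tf in Counter(tokens).items():
        if term in term_vectors:  # terms unseen in training are skipped
            weight = tf / len(tokens)
            for i, v in enumerate(term_vectors[term]):
                doc_vec[i] += weight * v
    return doc_vec
```

Whatever the vocabulary size, every document ends up with one value per profile, which is where the reduction in feature dimension comes from.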
13. Methodology. Gender prediction: SVMs with linear and RBF kernels.
Age prediction: ridge regression and lasso regression.
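To show how the two age-regression penalties differ, here is a minimal single-feature sketch with no intercept. It is a teaching simplification, not the models fitted in the study: `ridge_1d` minimizes sum((y - w*x)^2) + alpha*w^2, and `lasso_1d` minimizes 0.5*sum((y - w*x)^2) + alpha*|w|. Both function names and the scaling of `alpha` are assumptions, and the gender SVMs are not covered.

```python
import math

def ridge_1d(xs, ys, alpha):
    """Closed-form ridge weight for one feature: shrinks w toward zero."""
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    return sxy / (sxx + alpha)

def lasso_1d(xs, ys, alpha):
    """Closed-form lasso weight for one feature via soft-thresholding.

    Assumes xs is not all zeros. When alpha is large enough the weight
    becomes exactly zero, which is how lasso drops features.
    """
    sxy = sum(x * y for x, y in zip(xs, ys))
    sxx = sum(x * x for x in xs)
    shrunk = max(abs(sxy) - alpha, 0.0)  # soft-threshold the correlation term
    return math.copysign(shrunk, sxy) / sxx
```

With `xs = [1, 2, 3]` and `ys = [2, 4, 6]`, ridge only ever scales the weight down, while lasso sets it to exactly zero once `alpha` reaches 28.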
14. Results: (1) OVA-PCA-DR; (2) OVA-No-DR; (3) OVA-CHI2-DR; (4) SOR;
(5) SOR-CHI2-DR. Classification: accuracy, F1-score. Regression: R,
MAE, MSE, EVS.
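For reference, the evaluation metrics named on this slide can be computed as below. This is a plain-Python sketch using the standard definitions; the function names are illustrative, F1 is shown for a single positive class, and R is left out because the slide does not say whether it denotes Pearson's r or the coefficient of determination.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1_score(y_true, y_pred, positive):
    """Harmonic mean of precision and recall for the given positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def explained_variance(y_true, y_pred):
    """EVS = 1 - Var(y_true - y_pred) / Var(y_true); assumes Var(y_true) > 0."""
    def var(values):
        mean = sum(values) / len(values)
        return sum((v - mean) ** 2 for v in values) / len(values)
    residuals = [t - p for t, p in zip(y_true, y_pred)]
    return 1 - var(residuals) / var(y_true)
```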
21. Conclusions and future work. Age and gender inference models.
Reduce the feature dimension: χ² (15K terms). Second Order
Representation (SOR): reduce running time dramatically, age. PAN 2015
workshop and competition: Author Profiling.