SMP Challenge: An Overview of Social Media Prediction ... · 3 SOCIAL MEDIA PREDICTION DATASET OVERVIEW Social Media Prediction Dataset (SMPD) 1 is a large-scale bench-mark for social

SMP Challenge: An Overview of Social Media PredictionChallenge 2019

Bo [email protected] University

Wen-Huang [email protected]

National Chiao Tung University

Peiye [email protected]

Beijing University of Posts andTelecommunications

Bei [email protected] Research Asia

Zhaoyang [email protected]

Sun Yat-sen University

Jiebo [email protected] of Rochester

ABSTRACT“SMP Challenge” aims to discover novel prediction tasks for numer-ous data on social multimedia and seek excellent research teams.Making predictions via social multimedia data (e.g. photos, videos ornews) is not only helps us to make better strategic decisions for thefuture, but also explores advanced predictive learning and analyticmethods on various problems and scenarios, such as multimediarecommendation, advertising system, fashion analysis etc.

In the SMP Challenge at ACM Multimedia 2019, we introducea novel prediction task Temporal Popularity Prediction, which fo-cuses on predicting future interaction or attractiveness (in termsof clicks, views or likes etc.) of new online posts in social mediafeeds before uploading. We also collected and released a large-scaleSMPD benchmark with over 480K posts from 69K users. In thispaper, we define the challenge problem, give an overview of thedataset, present statistics of rich information for data and annota-tion and design the accuracy and correlation evaluation metrics fortemporal popularity prediction to the challenge.

CCS CONCEPTS• Information systems → Multimedia and multimodal re-trieval; Online advertising; • Human-centered computing →Social media; • Computing methodologies→ Computer visiontasks.

KEYWORDSSocial Multimedia, Visual Prediction, Popularity Prediction

ACM Reference Format:Bo Wu, Wen-Huang Cheng, Peiye Liu, Bei Liu, Zhaoyang Zeng, and JieboLuo. 2019. SMP Challenge: An Overview of Social Media Prediction Chal-lenge 2019. In Proceedings of the 27th ACM International Conference onMultimedia (MM ’19), October 21–25, 2019, Nice, France. ACM, New York,NY, USA, 5 pages. https://doi.org/10.1145/3343031.3356084

Permission to make digital or hard copies of all or part of this work for personal orclassroom use is granted without fee provided that copies are not made or distributedfor profit or commercial advantage and that copies bear this notice and the full citationon the first page. Copyrights for components of this work owned by others than ACMmust be honored. Abstracting with credit is permitted. To copy otherwise, or republish,to post on servers or to redistribute to lists, requires prior specific permission and/or afee. Request permissions from [email protected] ’19, October 21–25, 2019, Nice, France© 2019 Association for Computing Machinery.ACM ISBN 978-1-4503-6889-6/19/10. . . $15.00https://doi.org/10.1145/3343031.3356084

Figure 1: SMP Challenge introduces Temporal PopularityPrediction task for social multimedia. SMPD includes vi-sual content (diverse images with categories into a semantictaxonomy), textual content (e.g. title and custom tags) andspatial-temporal content (e.g. location and time). The popu-larity score in the figure is prediction target and calculatedby “user interactions” of online post.

1 INTRODUCTIONPeople are interested in predicting the future. For example, whichwho will win the upcoming Grammy Awards or which film will beprevalent in next few weeks? Making predictions about the futurebrings real values to a variety of applications and scenarios [15],such as multimedia recommendation [9, 17, 20], advertising sys-tem [13, 30], fashion analysis [5, 14], topic mining [4, 22, 27] etc.

Therefore, the purpose of SMP Challenge is to discover novelchallenge tasks based on numerous resources on social multimediaand seek excellent research teams who are capable of making theprediction. For prediction, the increasing ubiquity of social media(e.g. Facebook, Twitter, Flickr, YouTube, etc.) provides a crucial wayfor learning about the real-world. Meanwhile, social multimediadata increased interest for researches in study of exploring rich so-cial facts and knowledge with multi-modal information (e.g. images,text, video, events, etc.), while social media is now globally ubiqui-tous and prevalent. So far the researches of social media predictioncovered in several significant areas of multimedia and artificialintelligence, and closely integrated with computer vision, machinelearning, natural language and human-centered interaction.

During this year, the task of SMP Challenge 2019 is TemporalPopularity Prediction [22, 25], addressing the problem to predictthe future popularity of giving posts before they were publishedin social media. This treats popularity prediction at a time-relatedprediction problem [10, 21], and formulates popularity by online

arX

iv:1

910.

0179

5v2

[cs

.MM

] 2

1 Ja

n 20

20

https://doi.org/10.1145/3343031.3356084

https://doi.org/10.1145/3343031.3356084

Figure 2: The histogram of popularity score. Each scorerange collects the posts which have a popularity scorewithin the range [x-0.5,x].

attention based on various user interactions (e.g. clicks, visits, re-views). To achieve this goal, the participated teams need to designnew algorithms of understanding and learning techniques, and au-tomatically predict with considering post content, future post timeand its multiple multimedia information (as shown in Figure 1) ina time-related dynamic system [11, 19, 29].

In the literature, several large-scale datasets from social mediahave been established for various research tasks and helped leadto great advancements in multimedia technology and applications,such as YFCC [8], Yelp2016 [1], Visual Genome [12], etc. However,most of the existing datasets are limited in the diversity of cover-age, i.e. the collected data are often biased to the particular taskin question, and lacking cross-task generalization. Therefore, weintroduced Social Media Prediction Dataset (SMPD), a large-scalebenchmark dataset for sociological understanding and predictionswith over 486k posts and 80K users in total. SMPD collects multi-faceted information of a post, such as user profile, photo metadata,and visual content. Particularly, we aim to record the temporalorder of social media data. For example, social media posts in thedataset are obtained with temporal information to preserve thecontinuity of post sequences. Our goal is to make the SMPD asvaried and rich as possible to thoroughly represent the social media“world”.

2 TEMPORAL POPULARITY PREDICTION2.1 Problem FormulationTemporal Popularity Prediction (TPP) is a novel problem for socialmedia analyzing and learning [24]. With the temporal dynamics ofthe social multimedia system, the popularity of online posts usuallychanged over time. Influenced by the temporal characteristics [16,19] with complex contexts or patterns, how to predict accuratetemporal popularity become more challenging than before. Thetask of TPP is to estimate the future impacts of giving social mediaposts (photos, videos or news) at a specific time before they wereshared on the online platform. Specifically, given a new post vof a user u, predict popularity s describes how many attentionswould obtain if it was published at time t on social media. Theformulations of popularity can be defined as a score by differentdynamic indicators (e.g. views, likes or clicks, etc.) via diverse socialmultimedia platforms. In our challenge, we use “viewing count” as

Figure 3: An example of multi-level hierarchical photo cate-gories. It shows a 2nd -level category “animal” with 5 differ-ent 3rd -level categories.

a basic indicator of how popular a post is, while this is more general.So temporal popularity can be defined as the following:

Popularity Normalization. To suppress the large variationsamong different photos (e.g. view count of different photos varyfrom zero to millions), we implement a log function [26] to normal-ize the value of popularity, based on the previous work, as shownin Figure 2. In brief, the log-normalization function for popularitycan be defined as:

s = loд2r

d+ 1 (1)

where s is the normalized value, r is the view count of a photo, andd is the number of days since the photo was posted.

Particularly, the post sequence with time information for eachof user can be treated as time-series data. SMP Challenge 2019aimed to make time-series feeds for popularity prediction. Then,we defined sequence data with time orders:

User-Post Sequence. Suppose we have n user-photo pairs andthe sharing time of each pair. Then the user-post sequence can bedenoted by S = {(u1,v1), (u2,v2), ..., (un ,vn )} with its sharing timeorder t1 ≤ t2 ≤ ... ≤ tn .

3 SOCIAL MEDIA PREDICTION DATASETOVERVIEW

Social Media Prediction Dataset (SMPD) 1 is a large-scale bench-mark for social multimedia researches. We selected Flickr as thedata source of SMPD for multimedia and multi-modal data, whichis one of the largest photo-sharing websites with over 2 billionphotos monthly[18]. Different with single-task datasets, SMPD is amulti-faced data collection, which contains rich contextual informa-tion and annotations for multiple-tasks (such as user profile, postcategory, customize tag, geography information, photo image, andphoto metadata). The overview statistics of the dataset are shownin Table 1. It contains over 486K posts from 69K online users. Andeach of social media post has corresponding visual content andtextual content information (e.g. posted photos, photo categories,custom tags, temporal and geography information).

SMPDBuilding. To create a multi-faced dataset for social mediaresearch, we attempt to utilize a concept-based sampling method tocollect post data from the search engine of the Flickr platform. Theconcept-based approach aims to take a tag or concept as a searching

1http://smp-challenge.com/dataset

Table 1: Summary of SMPD Statistics.

Statistics ValueTrain Test

Number of Posts 3.05 × 105 1.81 × 105

Mean popularity of posts 6.41 5.12STD popularity of posts 2.47 2.41Number of users 3.8 × 104 3.1 × 104

Number of custom tags 2.5 × 105

Number of 1st level categories 11Number of 2nd level categories 77Number of 3rd level categories 668Temporal range of posts 480 daysAverage length of title 29 words

keyword, collects the posts that involved with the keyword. Onthis basis, a second selection will be manipulated to ensure the ac-curacy of concept-related posts [6]. The advantage of this approachis offering an accurate data source for theme extraction and featureextraction. Unlike traditional social bookmarking, Flickr does notinvolve creating an explicit vocabulary of tags to describe the post.Therefore, the referencing queries of different categories are filteredfrom the most popular tags which user liked most in 2015, suchas other tag prediction study [7]. We filtered the tags within in-complete or typo keywords, such as insta” or “instadog”, etc. Thenwe leave 756 categories within 11 topics for our dataset creation,as shown in Figure 4 and 5. To keep time-orders in our data, weobtained the public post stream continuously for each of the cate-gories in every day from Nov. 2015 to March. 2016. To have variousproperties in our dataset, we extracted abundant data includingvisual content (e.g. photo and photo categories), textual content (e.g.post title and custom tags) and Spatio-temporal content, revealingthe influence of region and time-zone on online social behavior.Finally, we have a large-scale multi-faced collection.

Figure 4: The statistics of posts in each 1st level category.It shows 11 1st level categories and the number of posts intrain and test data.

3.1 Visual ContentAs an old saying: “A picture is worth a thousand words", it is easierfor users to reflect their thoughts or emotions by photo/image onsocial media [2]. In our dataset, we collect 486k posts by querying

Figure 5: Overall distribution of popularity over days. Thediscrete black points represent the average popularity scoreper day. The blue line and shadow interval represent a re-gression result about the score. Specifically, the left part ofthe red line represents the training data and the right partrepresents the test data.

756 selected key-words (as mentioned in the prior section) withsocial media APIs [23]. These key-words can be organized into 11topics range from nature, people to animal (the directory tree inthe left of Figure 3). Furthermore, each of key-words representsan individual concept for photo content, such as “bird", “flower",etc. In the right part of Figure 3, an example category “Animal"with five different sub-concepts for content visualization is shown.The visual content (photo or image) of the posts with the samekey-word are similar in visual view. By utilized these categorizedphotos, it helps prepare train/test data for computer vision works.Furthermore, we generate 668 individual 3rd level human-craftcategories for photos of selected posts. By the 3 levels hierarchicalfine-grained classifying, SMPD provides fine-grained classes.

3.2 Spatio-temporal Content3.2.1 Time. Popularity prediction of social media posts is a time-sensitive task [3]. Temporal context of posts records user activities,and it is necessary to identify the uploading time. Meanwhile, Flickrprovides an uploading time for each submitted post. Figure 5 plotsthe average post counts and its uploaded months in our dataset.The posts show the most of posts in SMPD were uploaded betweenMarch 2015 to the creation time of the dataset in 2016. Among theupload dates, October and December become the popular monthfor sharing posts on the Flickr social media platforms. We attributethose improvements to the holidays and the end of the year.

3.2.2 Location. Location information provides user spatial distri-bution [28]. Not all posts have location information, but the locationinformation of photos point out the spatial region of user activi-ties. In SMPD, 32,068 posts have POI (Point of Interest) locationinformation with a geographic coordinate, either manually by theuser or automatically via GPS. Using this information, we wereable to map 10% of all items in the dataset to a single country orarea. Furthermore, SMPD provides geo accuracy value to representthe accuracy level of the location information, which ranges from

1 to 16 and represents from words level to street-level accuracy.To suppress the large variations among the number of posts indifferent territories, we implement a log function to normalize thenumber of posts in each country, based on the previous work. Thedistribution of all items over territories is shown in Figure 6.

Figure 6: The distribution of posts around theworld. The leg-end in the figure represents 5 equal interval ranges of Log-normalization value of posts′ number in each territory from0 to 9.28. The deeper the color is, the more posts are postedin the corresponding territory.

3.3 Textual ContentIn addition to visual content, we also collected the surrounding textof posts provides to show semantic information for each post. Asstatistics, there are more than 95% posts have relative descriptionsor titles. When uploading a photo on the social media platform, therelative textual content is appended to provide more details aboutthe photo content or publisher status.

Post Title. Each posted photo has a unique title named by theuser. As the saying goes: “There are a thousand Hamlets in a thou-sand people′s eyes", each title contains the explanation and under-standing of the photo. As shown in Table 1, users utilize average 29words to describe the content of uploaded photos, which not onlyhelps to analyze the visual content of the posts but also relate tothe popularity of the corresponding post.

Figure 7: The tag-cloud of 668 3rd -level keywords of photos.The larger the font size used, the corresponding tag is morefrequently used.

Post Tags. Most of social network sites provide hash-tags tomake user easier to find relevant post by topic with the same tag. Itis possible to label a single post with multiple tags. With this kindof flexibility, this method is easier than the traditional one-to-oneclassification. By counting the tag frequently used, we can havea glimpse of which topics are more popular within these socialnetwork sites. In Figure 7, we generated the “tag cloud" of posttags. The larger the font size used, the corresponding tag is more

frequently used. Based on the figure, the users of Flickr prefer toshare a holiday-related popular tag.

4 EVALUATIONTo measure the performance of temporal popularity predictionon time-series data, we adopt the time-related partition strategyto generate train/test sets for evaluation. In proposed time-seriesdata SMPD, we have a 10 length of the time window to build user-post sequences and divides the sequences to 2:1 for training andtest dataset. Specifically, the train and test sets also share similarnumbers of users.

By objective evaluation, we measure the performance of sub-mitted methods on the unpublished SMPD test set. Our evaluationprotocol is applied to the following criteria:

• Ranking Relevance: to measure the ordinal association be-tween ranked predicted popularity scores and actual ones.

• Prediction Error: to judge the error of the score prediction.As quantitative metrics of performance evaluation, we will computeSpearman Ranking Correlation (SRC, or Spearman′s Rho), MeanAbsolute Error (MAE) for each submitted model. SRC is a non-parametric measure of rank correlation, it applied to measure theranking correlation between ground-truth popularity set P and pre-dicted popularity set P̂ , varying from 0 to 1. If there are k samples,the SRC can be expressed as:

SRC =1

k − 1

k∑i=1

(Pi − P̄

σP

) (P̂i − ¯̂PσP̂

), (2)

where P̄ and σP are mean and variance of the corresponding popu-larity set. Furthermore, we also use Mean Absolute Error (MAE) tovalidate the prediction error. The goal of MAE is to calculate theaveraged prediction error:

MAE =1k

n∑i=1

| P̂i − Pi | . (3)

The ranking for the competition is based on an objective evalua-tion. Specifically, a rank list of teams is produced by sorting theirperformance on each of objective evaluation metrics, respectively.The final rank of a team is calculated by combining its two rankedmetrics for balance. The smaller the final ranking, the better theperformance.

5 CONCLUSIONSIn this paper, we have presented an overview of SMP Challenge2019 and proposed a large-scale social multimedia dataset for real-world prediction challenges. Meanwhile, we formulate the temporalpopularity prediction task, analyzes the proposed dataset and defineevaluation metrics. You can find more information about the task,dataset and challenge at SMP Challenge website 2.

ACKNOWLEDGMENTSWewould like to thankCAS-ICT,Microsoft ResearchAsia, AcademiaSinica for their helpful support. SMP Challenge 2019 was supportedby Columbia University. SMP Challenge 2017 and 2018 was spon-sored by Kwai Inc. and JD AI Research.2http://www.smp-challenge.com

REFERENCES[1] Nabiha Asghar. 2016. Yelp Dataset Challenge: Review Rating Prediction. CoRR

abs/1605.05362 (2016).[2] Spencer Cappallo, Thomas Mensink, and Cees GM Snoek. 2015. Latent Factors

of Visual Popularity Prediction. In Proceedings of ACM International Conferenceon Multimedia Retrieval.

[3] Nathan Ellering. 2016. What 16 Studies Say About The Best Times To Post OnSocial Media. http://coschedule.com/blog/best-times-to-post-on-social-media/.(2016). [Online].

[4] Emilio Ferrara, Roberto Interdonato, and Andrea Tagarelli. 2014. Online Popular-ity and Topical Interests Through the Lens of Instagram. In Proceedings of the25th ACM Conference on Hypertext and Social Media (HT ’14). 24–34.

[5] Shintami C Hidayati, Kai-Lung Hua, Wen-Huang Cheng, and Shih-Wei Sun. 2014.What are the Fashion Trends in New York. In Proceedings of ACM InternationalConference on Multimedia (ACM MM).

[6] Mark J. Huiskes and Michael S. Lew. 2008. The MIR Flickr Retrieval Evaluation.In Proceedings of the 1st ACM International Conference on Multimedia InformationRetrieval (MIR ’08). 39–43.

[7] Jin Yea Jang, Kyungsik Han, Patrick C Shih, and Dongwon Lee. 2015. Generationlike: Comparative characteristics in instagram. In Proceedings of the 33rd AnnualACM Conference on Human Factors in Computing Systems. 4039–4042.

[8] Sebastian Kalkowski, Christian Schulze, Andreas Dengel, and Damian Borth. 2015.Real-time Analysis and Visualization of the YFCC100M Dataset. In Proceedings ofthe 2015 Workshop on Community-Organized Multimodal Mining: Opportunitiesfor Novel Solutions (MMCommons ’15). 25–30.

[9] Aditya Khosla, Atish Das Sarma, and Raffay Hamid. 2014. What Makes an Image Popular?. In Proceedings of International World Wide Web Conference (WWW).

[10] Ryota Kobayashi and Renaud Lambiotte. 2016. TiDeH Time-Dependent Hawkesprocess for predicting retweet dynamics. In arXiv preprint arXiv:1603.09449.

[11] Shoubin Kong, Qiaozhu Mei, Ling Feng, Fei Ye, and Zhe Zhao. 2014. Predictingbursts and popularity of hashtags in real-time. In Proceedings of the 37th interna-tional ACM SIGIR Conference on Research & development in Information Retrieval.927–930.

[12] Ranjay Krishna, Yuke Zhu, Oliver Groth, Justin Johnson, Kenji Hata, JoshuaKravitz, Stephanie Chen, Yannis Kalantidis, Li-Jia Li, David A. Shamma, Michael S.Bernstein, and Fei-Fei Li. 2016. Visual Genome: Connecting Language and VisionUsing Crowdsourced Dense Image Annotations. CoRR abs/1602.07332 (2016).

[13] Cheng Li, Yue Lu, Qiaozhu Mei, Dong Wang, and Sandeep Pandey. 2015. Click-through Prediction for Advertising in Twitter Timeline. In Proceedings of KDD.

[14] Ling Lo, Chia-Lin Liu, Rong-An Lin, Bo Wu, and Wen-Huang Cheng. 2019. Dress-ing for Attention: Outfit Based Fashion Popularity Prediction. In IEEE Interna-tional Conference on Image Processing (ICIP).

[15] Travis Martin, Jake M Hofman, Amit Sharma, Ashton Anderson, and Duncan JWatts. 2016. Exploring limits to prediction in complex social systems. In Proceed-ings of International World Wide Web Conference (WWW).

[16] Joseph E. McGrath and Janice R. Kelly. 1992. Temporal Context and TemporalPatterning. Time & Society 1, 3 (1992), 399–420.

[17] Tao Mei, Bo Yang, Xian-Sheng Hua, and Shipeng Li. 2011. Contextual videorecommendation by multimodal relevance and user feedback. ACM Transactionson Information Systems (2011).

[18] Franck Michel. 2019. How many photos are uploaded monthly to Flickr?http://www.flickr.com/photos/franckmichel/6855169886. (2019). [Online].

[19] Seth A Myers and Jure Leskovec. 2014. The bursty dynamics of the twitterinformation network. In Proceedings of the 23rd International Conference on WorldWide Web. 913–924.

[20] Xueming Qian, He Feng, Guoshuai Zhao, and Tao Mei. 2014. PersonalizedRecommendation Combining User Interest and Social Circle. IEEE Transactionson Knowledge and Data Engineering 26, 7 (2014), 1763–1777.

[21] Benjamin Shulman, Amit Sharma, and Dan Cosley. 2016. Predictability of Pop-ularity: Gaps between Prediction and Understanding. In Proceedings of AAAIConference on Artificial Intelligence.

[22] Gabor Szabo and Bernardo A Huberman. 2010. Predicting the popularity ofonline content. Commun. ACM 53, 8 (2010), 80–88.

[23] Flickr team. 2019. Flickr API. https://www.flickr.com/services/api/. (2019). [On-line].

[24] Bo Wu, Wen-Huang Cheng, Yongdong Zhang, and Tao Mei. 2016. Time Matters:Multi-scale Temporalization of Social Media Popularity. In Proceedings of the 2016ACM on Multimedia Conference (ACM MM).

[25] Bo Wu, Wen-Huang Cheng, Yongdong Zhang, Huang Qiushi, Li Jintao, and TaoMei. 2017. Sequential Prediction of Social Media Popularity with Deep TemporalContext Networks. In International Joint Conference on Artificial Intelligence(IJCAI).

[26] Bo Wu, Tao Mei, Wen-Huang Cheng, and Yongdong Zhang. 2016. UnfoldingTemporal Dynamics: Predicting Social Media Popularity Using Multi-scale Tem-poral Decomposition. In Proceedings of the Thirtieth AAAI Conference on ArtificialIntelligence (AAAI).

[27] Chun-Che Wu, Tao Mei, Winston H Hsu, and Yong Rui. 2014. Learning topersonalize trending image search suggestion. In Proceedings of InternationalACM SIGIR Conference on Research and Development in Information Retrieval.727–736.

[28] Dingqi Yang, Daqing Zhang, and Bingqing Qu. 2016. Participatory CulturalMapping Based on Collective Behavior Data in Location-Based Social Networks.ACM Transactions on Intelligent Systems and Technology 7, 3, Article 30 (Jan. 2016),30:1–30:23 pages.

[29] Jaewon Yang and Jure Leskovec. 2011. Patterns of Temporal Variation in OnlineMedia. In Proceedings of ACM International Conference on Web Search and DataMining (WSDM).

[30] Qingyuan Zhao, Murat A Erdogdu, Hera Y He, Anand Rajaraman, and JureLeskovec. 2015. Seismic: A self-exciting point process model for predicting tweetpopularity. In Proceedings of ACM SIGKDD Conference on Knowledge Discoveryand Data Mining. 1513–1522.

Documents

SMP Challenge: An Overview of Social Media Prediction ... · 3 SOCIAL MEDIA PREDICTION DATASET OVERVIEW Social Media Prediction Dataset (SMPD) 1 is a large-scale bench-mark for social