KDD'14 Debrief, 24th - 27th August 2014, New York City, US. WING Monthly Meeting (Oct 24, 2014)
- Slide 1
- KDD'14 Debrief, 24th - 27th August 2014, New York City, US. WING Monthly Meeting (Oct 24, 2014). Presented by Xiangnan He
- Slide 2
- Opening Ceremony
- Slide 3
- Welcome Words: "Do not spend your precious time asking 'Why isn't the world a better place?' It will only be time wasted. The question to ask is 'How can I make it better?' To that there is an answer." -- Leo Buscaglia
- Slide 4
- Overview: The largest KDD conference ever. Number of attendees: 2,200+ (vs. 1,176 last year). 151 research papers (20% growth over KDD'13), 43 industry & government papers (30% growth), 26 workshops (75% growth), 12 tutorials (100% growth). What's new? Paper spotlights every morning (1 min/paper); all papers are required to present a poster; a networking session, "Building a Career in Data Science".
- Slide 5
- Research Track
- Slide 6
- Reviewing Process
- Slide 7
- Submissions per Country
- Slide 8
- Acceptance by Subject Area
- Slide 9
- Predicting Paper Acceptance
- Slide 10
- Slide 11
- Academia vs. Industry
- Slide 12
- Review Statistics
- Slide 13
- Slide 14
- Research Topics: Some technical topics I found especially notable/popular: topic/graphical modeling (no longer only for text mining; many tasks are now addressed with this method); deep learning (2 tutorials, but no full papers); social networks and graph analytics (popular for the last 10 years, and even more so this year); recommendation; workforce analytics.
- Slide 15
- Best Paper Awards. Best paper: Reducing the Sampling Complexity of Topic Models. Aaron Q. Li, Carnegie Mellon University; Amr Ahmed, Sujith Ravi, Alexander J. Smola, Google. Best student paper: An Efficient Algorithm for Weak Hierarchical Lasso. Yashu Liu, Jie Wang, Jieping Ye, Arizona State University.
- Slide 16
- Test of Time Award: Integrating Classification and Association Rule Mining [KDD 1998], cited over 2,000 times.
- Slide 17
- Some interesting papers: Mining Topics in Documents: Standing on the Shoulders of Big Data. Zhiyuan Chen, Bing Liu; University of Illinois at Chicago. Matching Users and Items Across Domains to Improve the Recommendation Quality. Chung-Yi Li, Shou-De Lin; National Taiwan University. FoodSIS: A Text Mining System to Improve the State of Food Safety in Singapore. Kiran Kate, Sneha Chaudhari, Andy Prapanca, Jayant Kalagnanam; IBM Research.
- Slide 18
- Mining Topics in Documents: Standing on the Shoulders of Big Data. Zhiyuan Chen, Bing Liu; University of Illinois at Chicago. Proposes a topic-model variant that generates more accurate and coherent topics by integrating knowledge mined from prior domains. Two kinds of knowledge: must-links (word pairs that should appear in the same topic) and cannot-links (word pairs that should not). The knowledge is mined through frequent itemset mining; since mined knowledge can be wrong, the authors further propose rules to clean it up. The cleaned knowledge is then easily integrated into the inference algorithm via a generalized Pólya urn model.
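The frequent-itemset step for must-links can be sketched minimally as mining frequent length-2 itemsets over the top words of topics from prior domains (a toy illustration; `mine_must_links`, `min_support`, and the inputs are my own illustrative names, not the paper's code):

```python
from itertools import combinations
from collections import Counter

def mine_must_links(prior_topics, min_support=2):
    """Mine must-link candidates: word pairs that co-occur in the top
    words of topics from at least `min_support` prior topics.
    Each element of `prior_topics` plays the role of one transaction."""
    pair_counts = Counter()
    for words in prior_topics:
        # count every unordered word pair within one topic's top words
        for pair in combinations(sorted(set(words)), 2):
            pair_counts[pair] += 1
    # keep only pairs meeting the support threshold
    return sorted(p for p, c in pair_counts.items() if c >= min_support)

# toy prior-domain topics: "cheap"/"price" recur together, so they
# survive as a must-link candidate
topics = [["price", "cheap", "expensive"],
          ["cheap", "price", "battery"],
          ["battery", "life", "long"]]
print(mine_must_links(topics))  # [('cheap', 'price')]
```

In the paper the surviving pairs are further filtered by cleaning rules before being fed into the generalized Pólya urn sampler; this sketch stops at the mining step.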
- Slide 19
- Innovation Award Talk: Principles of Very Large Scale Modeling, by Pedro Domingos, University of Washington. Three principles: 1. Model the whole, not just the parts: people (customers) influence each other, so model the whole network, not each person separately. 2. Tame complexity via hierarchical decomposition: we can make two assumptions, (1) subparts are independent given the part, and (2) the probability for a class is the average over its subclasses. Using the hierarchy with these two assumptions makes inference tractable. Example: Markov Logic Network + Sum-Product Theorem = Tractable Markov Logic. 3. Time and space should not depend on data size.
- Slide 20
- THANK YOU! Video recordings of KDD: http://videolectures.net/kdd2014_newyork/