Understanding User Generated Data Wuhan University Tieyun Qian School of Computer Science, Wuhan University Email: [email protected]

Wuhan University Understanding User Generated Dataws.nju.edu.cn/conf/keqa2019/resources/钱铁云... · 2019. 12. 17. · Understanding User Generated Data Wuhan University Tieyun


Understanding User Generated Data

Wuhan University

Tieyun Qian

School of Computer Science, Wuhan University

Email: [email protected]


Outline

User Generated Data

An overview on Sentiment Analysis

An overview on Recommender Systems

Our Work


User Generated Data

Web 2.0: from users receiving data to users generating data.

User generated content: explicit

Texts

Pictures

Videos

……

User generated data: both explicit and implicit

Behaviors

Interactions

Users’ relations

……


User Generated Data

In this talk, I will focus on:

Explicit texts

Sentiment analysis

I will also focus on:

Implicit interactions

Recommender systems


Understanding User Generated Data

It is important to understand user generated data!

For sellers: sales promotion and quality improvement

For buyers: comparison and analysis before making a decision

For government: public sentiment monitoring

For marketing: financial analysis

……


An overview on Sentiment Analysis

SA: automatically detect the polarity of texts

Granularity

document

sentence

aspect

Application: more detailed and thorough analysis

Technique: model the semantic relation between the aspect and the sentence


An overview on Sentiment Analysis

Two types of scenarios:

Aspect term sentiment analysis (ATSA)

to predict the polarity of an entity in the sentence

Aspect category sentiment analysis (ACSA)

to predict the polarity of a predefined aspect category

Example: “A mix of students and area residents crowd into this narrow, barely there space for its quick, tasty treats at dirt-cheap prices.”

ATSA

space: negative, treats: positive, prices: positive

ACSA

food: positive, price: positive, ambience: negative
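The two scenarios differ mainly in what a label is attached to. Below is a minimal sketch of how the example sentence might be stored. The plain-dictionary layout and the helper `terms_present` are illustrative assumptions, not a dataset format used in the talk.

```python
import re

# Hypothetical record layout for the example sentence; names are assumptions.
sentence = ("A mix of students and area residents crowd into this narrow, "
            "barely there space for its quick, tasty treats at dirt-cheap prices.")

# ATSA: labels attach to aspect terms that literally occur in the sentence.
atsa_labels = {"space": "negative", "treats": "positive", "prices": "positive"}

# ACSA: labels attach to predefined categories that need not occur in the sentence.
acsa_labels = {"food": "positive", "price": "positive", "ambience": "negative"}

def terms_present(text, labels):
    """Return the label keys that occur as whole words in the text."""
    tokens = set(re.findall(r"[a-z]+", text.lower()))
    return [key for key in labels if key in tokens]
```

In this sketch every ATSA key is found in the sentence and no ACSA key is, which matches the later remark that categories are hard to locate in the sentence.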


Main Problems in ABSA

Data labeling is extremely expensive in ABSA

Hard to train high-quality embeddings for the terms with low frequency

Hard to locate the aspect category in the sentence


An overview on Recommender Systems

Traditional recommender systems

Sequential recommender systems (SRS) √

Session based sequential recommender systems √


Main Problems in SRS

Modeling long term interests

Modeling short term interests

Dealing with cold start issues


Our Work

Sentiment analysis

Recommender systems

User profiling

Representation learning


2019/12/15

Transfer Capsule Network for Aspect Level Sentiment Classification (ACL2019)


Transfer Capsule Network for Aspect Level Sentiment Classification

Zhuang Chen, Tieyun Qian∗

School of Computer Science, Wuhan University, China

{zhchen18, qty}@whu.edu.cn

Published at ACL 2019 (Research Long Paper)


TransCap - Motivations(1/17)

► The lack of labeled data is a major obstacle in aspect-level sentiment classification (ASC). Publicly available datasets for ASC often contain a limited number of training samples.

► Document-level labeled data such as reviews are easily accessible from online websites such as Yelp and Amazon. The accompanying rating scores can naturally serve as sentiment labels.

► Document-level data contain useful sentiment knowledge for analyzing aspect-level data, since the two may share many linguistic and semantic patterns.


TransCap - Architecture(2/17)

Definitions

► TA : Aspect-level sentiment classification (ASC).

► TD : Document-level sentiment classification (DSC).

► TransCap: Improve TA with transferred knowledge from TD.


TransCap - Architecture(3/17)

Preliminary

► CapsNet : CapsNet was first proposed for image classification in computer vision.

► Applied to NLP tasks : text classification, relation extraction...

► Why we use capsules : encapsulated features, separate classes, dynamic routing.


TransCap - Architecture(4/17)

TransCap Overview

[Figure: TransCap overview with four layers. Layers (1), (2) and (3) are shared between the two tasks; layer (4) is non-shared.]


TransCap - Architecture(5/17)

(1) Input Layer (shared)

► Look-up layer

► Position Information

► Sentence Embedding


TransCap - Architecture(6/17)

(2) Feature Capsule Layer (shared)

► Filter Group

► Multiple Convolution Operations

► Generated Feature Capsules (1 category of semantic meaning)

► Repeat for Multiple Channels (C categories of semantic meaning)


TransCap - Architecture(7/17)

(3) Semantic Capsule Layer (shared)

► An Additional Convolution

► Aspect Routing

► Generated Semantic Capsules (C categories of semantic meaning)

[Formulas on the slide are labeled: weight, routing, condensation.]


TransCap - Architecture(8/17)

(4) Class Capsule Layer (non-shared)

► Separate Class Capsules

6 capsules for 2 tasks & 3 polarities

► Generated Class Capsules

► Dynamic Routing

prediction vector

weighted sum

output vector
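The three labels above (prediction vector, weighted sum, output vector) match the steps of routing-by-agreement as commonly described for capsule networks. The sketch below follows that generic procedure in plain Python. It is an assumption that TransCap's class capsule layer uses this form; the function names, the iteration count and the nested-list layout of `u_hat` are illustrative.

```python
import math

def squash(s):
    """Scale vector s so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    if norm_sq == 0.0:
        return [0.0 for _ in s]
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq)
    return [scale * x for x in s]

def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(b - m) for b in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dynamic_routing(u_hat, iterations=3):
    """u_hat[i][j] is the prediction vector from input capsule i for output capsule j.

    Returns the output vectors v[j] and the coupling coefficients c[i][j]
    that were used in the last iteration.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits, start uniform
    for _ in range(iterations):
        c = [softmax(row) for row in b]  # coupling coefficients per input capsule
        v = []
        for j in range(n_out):
            # Weighted sum of prediction vectors, then squash to get the output vector.
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in)) for d in range(dim)]
            v.append(squash(s_j))
        for i in range(n_in):
            for j in range(n_out):
                # Agreement between prediction and output raises the logit.
                b[i][j] += sum(a * w for a, w in zip(u_hat[i][j], v[j]))
    return v, c
```

The length of each output vector can be read as the confidence for that class, which is why the class capsules are kept separate per task and per polarity.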


TransCap - Architecture(9/17)

Training Procedure

► Data from TD and TA take turns to train the TransCap model.

► Layers (1)(2)(3) transfer knowledge from TD to TA; layer (4) avoids mutual disturbance.

► Loss function: defined for each polarity, then for a single task, then as the final loss.
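The formulas for the three loss levels did not survive on this slide. The sketch below assumes the margin loss commonly used with capsule networks and a weighted sum of the two task losses. The thresholds `m_plus` and `m_minus`, the down-weighting factor `lam`, and the way the balance factor `gamma` enters the sum are assumptions, not values taken from the talk.

```python
def margin_loss(lengths, target, m_plus=0.9, m_minus=0.1, lam=0.5):
    """Sum of per-polarity margin losses for one sample of a single task.

    lengths: lengths of the class capsules' output vectors, one per polarity.
    target:  index of the gold polarity.
    """
    loss = 0.0
    for j, length in enumerate(lengths):
        if j == target:
            # The gold class capsule should be long (at least m_plus).
            loss += max(0.0, m_plus - length) ** 2
        else:
            # Other class capsules should be short (at most m_minus).
            loss += lam * max(0.0, length - m_minus) ** 2
    return loss

def final_loss(loss_aspect, loss_document, gamma):
    """Combine the two task losses; gamma weights the auxiliary document-level task."""
    return loss_aspect + gamma * loss_document
```

Under this reading, setting `gamma` to 0 would switch off the document-level task entirely, which is one way to interpret the balance-factor analysis later in the talk.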


TransCap - Experiments (10/17)

Datasets for TA

► SemEval 2014 Task 4 : Restaurant & Laptop

► 80% for training, 20% for development

Datasets for TD

► Yelp, Amazon and Twitter

► 30,000 balanced samples, all for training

► <3 : negative, =3 : neutral, >3 : positive (Yelp, Amazon)

Dataset Combinations

► {Y,A} : {Restaurant+Yelp, Laptop+Amazon} Relevant but Imprecise

► {T,T} : {Restaurant+Twitter, Laptop+Twitter} Precise but Irrelevant
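The rating thresholds listed for Yelp and Amazon can be written as a small mapping. This is a sketch only: the function name is hypothetical and it assumes integer star ratings on the usual 1 to 5 scale.

```python
def rating_to_polarity(stars):
    """Map a star rating to a polarity label: <3 negative, =3 neutral, >3 positive."""
    if stars < 3:
        return "negative"
    if stars == 3:
        return "neutral"
    return "positive"
```

Labels obtained this way come for free but are noisy with respect to individual aspects, which is presumably what "Relevant but Imprecise" refers to.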


TransCap - Experiments (11/17)

Results

► Averaged results over 5 runs with random initialization.

► TransCap outperforms all baselines.

► {Y,A} and {T,T} achieve similar results.


TransCap - Experiments (12/17)

Ablation Study

► “- A”: remove Aspect routing

► “- S”: remove Semantic capsules

► “- D”: remove Dynamic routing


TransCap - Experiments (13/17)

Parameter Analysis

► Influence of Auxiliary Corpus Size


TransCap - Experiments (14/17)

Parameter Analysis

► Effects of Balance Factor γ


TransCap - Experiments (15/17)

Case Study

► Part 1 : What does TransCap transfer?

test sample (TransCap √ Others ×)

1. “It has so much more speed and the [screen]pos is very sharp.”

“sharp” is a multi-polarity word

2. “Once open, the [leading edge]neg is razor sharp.”

3. “[Graphics]pos are clean and sharp, internet interfaces are seamless.”

- The training set in Laptop contains only 8 samples that include “sharp”, 5 of which are labeled negative.

- The Amazon dataset contains 294 samples in which “sharp” co-occurs with many different contexts.


TransCap - Experiments (16/17)

Case Study

► Part 2 : How does TransCap make decisions?

test sample

1. “Great [food]pos but the [service]neg is dreadful!”

Figure. Visualization of coupling coefficients c_ij after dynamic routing.


TransCap - Experiments (17/17)

Case Study

► Part 3 : Can TransCap handle complicated patterns?

test sample (TransCap √ Others ×)

1. “The [staff]neg should be a bit more friendly.”

- A euphemistic negative review of the aspect [staff].

- TransCap generates and transfers sentence-level knowledge.

auxiliary sample from Yelp

2. “[The pro-shop staff should be more polite when answering the phone...]neg”


Aspect Aware Learning for Aspect Category Sentiment Analysis (TKDD19)

Aspect Aware Learning for Aspect Category Sentiment Analysis

Peisong Zhu, Zhuang Chen, Haojie Zheng, Tieyun Qian∗

School of Computer Science, Wuhan University, China

{zhups24, zhchen18, zhenghaojie, qty}@whu.edu.cn

Published at TKDD 2019


AAL - Motivations(1/11)

► For category-based ASC, 1) the first challenge is to locate the exact position of the aspect category, and 2) the second is to correlate opinion words with different categories in one sentence.

Example: Great food but the service is dreadful.

[Figure: attention weights over the example sentence, two panels.]

Panel 1: great 0.04, food 0.03, but 0.01, the 0.01, service 0.01, was 0.01, dreadful 0.80, ! 0.09

Panel 2: great 0.07, food 0.04, but 0.02, the 0.01, service 0.01, was 0.02, dreadful 0.72, ! 0.11

Case 1: Category: service, Label: Negative, Prediction: Negative

Case 2: Category: food, Label: Positive, Prediction: Negative

► Categories have specific opinion words.

delicious → food

expensive → price

quiet → ambience

Example: The fish is delicious.


AAL - Architecture(2/11)

AAL Overview

We introduce additional supervision (aspect-aware learning, AAL) to help correlate contexts with given aspect categories.

► AAL-LEX at word level

► AAL-SS at sentence level


AAL - Architecture(3/11)

(1) Input Layer

► Look-up layer initialized by GloVe

► Bi-LSTM

(2) Task 1 - Sentiment Prediction

stack to multiple layers...

► Attention

► Aggregation

► Prediction

► Task1 loss


AAL - Architecture(4/11)

(3) Task 2 - Category Prediction


AAL-LEX at word level

► Word-cate PMI

► Co-occurrence prob

► Normalization

► Task2 loss

► Final loss
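The word-category PMI step can be illustrated with the textbook definition PMI(w, c) = log(p(w, c) / (p(w) p(c))), estimated from sentence-level co-occurrence counts. This is a sketch under that assumption; the exact estimator, smoothing and normalization used in AAL-LEX are not given on the slide, and the function name is hypothetical.

```python
import math
from collections import Counter

def word_category_pmi(samples):
    """samples: list of (tokens, category) pairs. Returns {(word, category): PMI}."""
    n = len(samples)
    word_count, cat_count, pair_count = Counter(), Counter(), Counter()
    for tokens, category in samples:
        cat_count[category] += 1
        for word in set(tokens):  # count each word once per sentence
            word_count[word] += 1
            pair_count[(word, category)] += 1
    pmi = {}
    for (word, category), joint in pair_count.items():
        p_joint = joint / n
        p_word = word_count[word] / n
        p_cat = cat_count[category] / n
        pmi[(word, category)] = math.log(p_joint / (p_word * p_cat))
    return pmi
```

With such scores, a category-specific word like "delicious" would rank above a function word like "the" for the food category, which is the kind of lexicon-level signal the word-level variant appears to rely on.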


AAL - Architecture(5/11)

(3) Task 2 - Category Prediction


AAL-SS at sentence level

► Attention&aggregation

► Category prob

► Task2 loss

► Final loss


AAL - Experiments (6/11)

Datasets

SemEval 2014, SemEval 2015, SemEval 2016


AAL - Experiments (7/11)

Results


AAL - Experiments (8/11)

Parameter Analysis

► Effects of Balance Factor


AAL - Experiments (9/11)

Parameter Analysis

► Effects of Layer Number


AAL - Experiments (10/11)

Case Study


► Aspect Lexicon


AAL - Experiments (11/11)

Case Study


► Attention Visualization

Example: “Do n't be put off by another reviewer's extremely negative reviews!”


Enhanced Aspect Level Sentiment Classification with Auxiliary Memory (COLING18)

Enhanced Aspect Level Sentiment Classification with Auxiliary Memory

Peisong Zhu, Tieyun Qian∗

School of Computer Science, Wuhan University, China

{zhups24, qty}@whu.edu.cn

Published at COLING 2018 (Research Long Paper)


DAuM - Motivations(1/9)

► For term-based ASC, it is hard to train high-quality embeddings for terms with low frequency.

► For category-based ASC, categories do not explicitly occur in sentences, so it is hard for a model to capture aspect-related context words from the sentence.

► The terms and categories in ASC are closely related to each other. Existing methods fail to utilize the relevance between them because only very few datasets are annotated with both labels.


DAuM - Architecture(2/9)

DAuM Overview

Deep memory network with an Auxiliary Memory

► Auxiliary Memory : Generates the corresponding category embeddings for given terms, and vice versa.

► Sentiment Memory : Both original and generated embeddings are fed into the sentiment memory to collect relevant information from the context.


DAuM - Architecture(3/9)

(1) Auxiliary Memory (e.g. term => category)

► Initialized memory

► Generate category embeddings

(2) Sentiment Memory

stack to multiple layers...

► Term attention

► Category attention

► Aggregation


DAuM - Architecture(4/9)

Loss Function


► Sentiment Prediction

► Semantic Relatedness (term & category) Regularization

► Category Embedding Regularization


DAuM - Experiments (5/9)

Datasets

SemEval 2014, SemEval 2016

[Tables: dataset statistics for the category-based and term-based settings.]


DAuM - Experiments (6/9)

Results

[Tables: results for the category-based and term-based settings.]


DAuM - Experiments (7/9)

Parameter Analysis

► Effects of Multiple Layers


DAuM - Experiments (8/9)

Parameter Analysis

► Effects of Aspect Number k


DAuM - Experiments (9/9)

Case Study

[Figure: three panels, labeled from “food”, from “service” and from “price”.]


Our Work

Sentiment analysis

Recommender systems

User profiling

Representation learning


Spatiotemporal Representation Learning for Translation-Based POI Recommendation

Tieyun Qian1, Bei Liu1, Quoc Viet Hung Nguyen2, Hongzhi Yin3

1Wuhan University, 2Griffith University, 3The University of Queensland

Published at TOIS 2019 (Regular Research Paper)


Problems in POI recommendation (1/12)

• Highly sensitive to spatial and temporal context

• Cold start


STL – Main Contributions (2/12)

We adopt translation-based knowledge graph embedding techniques to model the spatiotemporal effects in POI recommendation. The joint modeling of spatiotemporal information also distinguishes our work from existing studies, which consider spatial and temporal information separately.

We propose a new type of cold-start problem, cold-start spatiotemporal contexts, and develop effective methods to exploit and integrate various correlation information into the representation learning of cold-start users, items, and spatiotemporal contexts to address cold-start problems.


STL – Proposed Model (3/12)

Spatiotemporal Representation Learning for Translation-Based POI Recommendation (TOIS 2019)

Basic idea:

We represent the spatiotemporal context <t, l> as a type of relation that captures the user’s check-in behavior at a specific time and location.

A user u reaches a POI of interest vq via a translation edge tl.
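The basic idea reads like a translation in embedding space: the user vector plus the context vector should land near the vector of the visited POI. The sketch below assumes a TransE-style Euclidean distance as the score. The actual model may use a different distance, a projection step or a different training objective, and all names here are hypothetical.

```python
import math

def translation_distance(user, context, poi):
    """Distance between (user + context) and poi; smaller means a better match."""
    return math.sqrt(sum((u + r - v) ** 2 for u, r, v in zip(user, context, poi)))

def rank_pois(user, context, pois):
    """Return POI ids sorted from best to worst match for a user and a <t, l> context.

    pois: dict mapping a POI id to its embedding (a list of floats).
    """
    return sorted(pois, key=lambda pid: translation_distance(user, context, pois[pid]))
```

Because the context is a single relation vector for the pair <t, l>, time and location are modeled jointly rather than as two separate factors, in line with the first contribution.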


STL – Proposed Model (4/12)

Dealing with cold start check-ins:

The cold-start spatiotemporal contexts are new time-location pairs that have never appeared in the training dataset. Almost all individual spatial and temporal contexts have been seen before, even though their combinations are cold start.

For each cold-start context, we find its nearest and farthest neighbors based on their spatial and temporal similarity.


STL – Proposed Model (5/12)

Dealing with cold start users:

We propose two geo-social correlation measures to incorporate a user’s social and geographical contexts.

We employ the normalized ratio of common friends in two users’ social circles as the similarity metric of social influence.

The geographical similarity simgeo is defined similarly to siml.


STL – Proposed Model (6/12)

Dealing with cold start POIs:

We propose two geo-semantic correlation measures to incorporate the semantic and geographical contexts of a POI.

The semantic similarity simsem is defined as the Jaccard coefficient between the tag sets T(pq) and T(pk) of POIs pq and pk.
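The Jaccard coefficient of two tag sets is the size of their intersection divided by the size of their union. A minimal sketch follows; the function name and the convention of returning 0.0 for two empty tag sets are assumptions made here.

```python
def jaccard(tags_a, tags_b):
    """Jaccard coefficient |A ∩ B| / |A ∪ B| of two tag collections."""
    a, b = set(tags_a), set(tags_b)
    if not a and not b:
        return 0.0  # assumed convention for two POIs without any tags
    return len(a & b) / len(a | b)
```

For example, two POIs tagged {coffee, wifi} and {coffee, bakery} share one of three distinct tags, so their semantic similarity under this definition is 1/3.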


STL – Evaluation (7/12)

We evaluate our methods on two real-life LBSN datasets: Foursquare and Gowalla.


STL – Evaluation (8/12)

We compare our STA with 10 POI recommendation methods. They represent the state of the art in two respects:

First, they cover four types of popular recommendation techniques, i.e., collaborative filtering, matrix factorization, distributed representation, and hybrid models.

Second, they consider six important factors that influence user decision-making when choosing POIs: user preference, temporal, geographical, social, content, and sequential influence.


STL – Evaluation (9/12)

Comparison Results


STL – Evaluation (10/12)

Comparison Results


STL – Evaluation (11/12)

Sensitivity to data sparsity


Results in cold-start scenario

STL – Evaluation (12/12)


What Can History Tell Us? Identifying Relevant Sessions for Next-Item Recommendation

Ke Sun¹, Tieyun Qian¹, Hongzhi Yin², Tong Chen², Yiqi Chen¹, Ling Chen³

¹Wuhan University, ²University of Queensland, ³University of Technology, Sydney

Published at CIKM 2019 (Research Long Paper)


Background

• Sequential recommendation

Modelling sequential dependencies of user-item interactions.

• Session-based recommendation

A subtask of sequential recommendation.

User transaction sequence is partitioned into sessions.
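As a rough illustration of partitioning a transaction sequence into sessions, the Python sketch below splits a list of (timestamp, item) events whenever the gap between consecutive events exceeds a threshold. The slides do not say how sessions are delimited, so the inactivity-gap rule, the function name `split_into_sessions`, and the `max_gap` parameter are all assumptions made for illustration.

```python
from typing import List, Tuple


def split_into_sessions(events: List[Tuple[int, str]], max_gap: int) -> List[List[str]]:
    """Partition (timestamp, item) events into sessions.

    Assumption: a new session starts whenever the time since the previous
    event exceeds max_gap. Real datasets may use other rules (e.g. one
    session per day).
    """
    sessions: List[List[str]] = []
    last_ts = None
    for ts, item in sorted(events):  # process events in time order
        if last_ts is None or ts - last_ts > max_gap:
            sessions.append([])  # open a new session
        sessions[-1].append(item)
        last_ts = ts
    return sessions


# Toy data: three bursts of activity separated by long pauses.
events = [(0, "a"), (5, "b"), (100, "c"), (104, "d"), (300, "e")]
sessions = split_into_sessions(events, max_gap=30)

# Historical sessions versus the current (most recent) session.
history, current = sessions[:-1], sessions[-1]
```

With sessions in this form, everything before the last session plays the role of the user's history, and the last session is the one whose next item is to be predicted.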

What Can History Tell Us? Identifying Relevant Sessions for Next-Item Recommendation (CIKM 2019)


Motivation

• Limitations of existing sequential methods:

Ignore long-term user preferences. (Fig. a)

Consider all historical sessions without any distinction. (Fig. b)


Our Contributions

• We propose a novel deep-learning-based sequential recommender framework for session-based recommendation by integrating both long-term and short-term user preferences in a unified way.


Problem and Preliminary

• Problem Definition: Given the historical session sequence $\{S_1, \dots, S_{n-1}\}$ and the current session $S_n = \{i_1, \dots, i_{t-1}\}$ of a user $u$, predict the next item $i_t$.

• Preliminary: nonlocal structure

Equation annotations: normalization factor, similarity scalar
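For reference, the generic nonlocal operation computes each output as a normalized, similarity-weighted sum over all positions, $y_i = \frac{1}{C(x)} \sum_j f(x_i, x_j)\, g(x_j)$, where $f$ returns the similarity scalar and $C(x)$ is the normalization factor. The Python sketch below is one simple instantiation under stated assumptions: $f$ is the exponential of a dot product, $g$ is the identity, and there are no learned projections. The names `nonlocal_op`, `query`, and `memory` are hypothetical and not taken from the paper.

```python
import math
from typing import List

Vector = List[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def nonlocal_op(query: Vector, memory: List[Vector]) -> Vector:
    """Nonlocal response of `query` with respect to every vector in `memory`.

    Assumptions for this sketch:
      f(q, m) = exp(q . m)   (similarity scalar)
      g(m)    = m            (identity transform)
      C       = sum_j f(q, m_j)   (normalization factor)
    """
    sims = [math.exp(dot(query, m)) for m in memory]  # similarity scalars
    norm = sum(sims)                                   # normalization factor
    dim = len(query)
    return [
        sum(s * m[d] for s, m in zip(sims, memory)) / norm
        for d in range(dim)
    ]


# The query is closer to the first memory vector, so the output leans toward it.
out = nonlocal_op([1.0, 0.0], [[1.0, 0.0], [0.0, 1.0]])
```

In this form the operation weights every memory entry by its similarity to the query, which is the property that lets a model emphasise relevant historical sessions instead of treating all of them equally.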


Architecture

We design a two-layer nonlocal architecture to learn long-term user preferences from relevant historical sessions.

We integrate the nonlocal structure with a gated recurrent unit (GRU) to learn short-term user preferences.


Proposed Method

• Long-term preference modeling

two-layer nonlocal neural network

session-level:


Proposed Method

• Long-term preference modeling

two-layer nonlocal neural network

item-level:


Proposed Method

• Short-term preference modeling

GRU and nonlocal neural network


Proposed Method

• Personalized Integration

historical representation

current representation


Proposed Method

• Personalized Integration

overall representation
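As a rough illustration of adaptively combining a historical and a current representation into an overall one, the Python sketch below uses a scalar sigmoid gate computed from both inputs. This is a generic gating scheme assumed here for illustration only, not the formulation from the paper; the names `fuse`, `gate_weights`, and `gate_bias` are hypothetical.

```python
import math
from typing import List

Vector = List[float]


def sigmoid(z: float) -> float:
    """Logistic function mapping any real number into (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def fuse(historical: Vector, current: Vector,
         gate_weights: Vector, gate_bias: float = 0.0) -> Vector:
    """Combine two representations with a learned scalar gate (assumed form).

    gate    = sigmoid(w . [historical; current] + b)
    overall = gate * historical + (1 - gate) * current
    """
    features = historical + current  # concatenation of both vectors
    z = gate_bias + sum(w * x for w, x in zip(gate_weights, features))
    gate = sigmoid(z)
    return [gate * h + (1.0 - gate) * c for h, c in zip(historical, current)]


# With all-zero gate parameters the gate is 0.5, i.e. a plain average.
overall = fuse([1.0, 0.0], [0.0, 1.0], gate_weights=[0.0, 0.0, 0.0, 0.0])
```

Because the gate depends on the two representations themselves, different users (and different sessions of the same user) can end up with different mixes of long-term and short-term signal.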


Proposed Method

• Prediction

personalized user preference


• Dataset

Tmall (an E-commerce dataset)

Gowalla (a Point-Of-Interest recommendation dataset)

• Evaluation

Recall

MRR
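For concreteness, the Python sketch below computes Recall@K and MRR@K for next-item prediction, where each test case has exactly one ground-truth item. These are the usual definitions of the two metrics; the cutoff `k`, the function names, and the toy data are assumptions, since the slides do not give the exact evaluation protocol.

```python
from typing import List, Sequence


def recall_at_k(ranked_lists: Sequence[Sequence[str]],
                targets: Sequence[str], k: int) -> float:
    """Fraction of test cases whose target item appears in the top-k list."""
    hits = sum(1 for ranked, target in zip(ranked_lists, targets)
               if target in ranked[:k])
    return hits / len(targets)


def mrr_at_k(ranked_lists: Sequence[Sequence[str]],
             targets: Sequence[str], k: int) -> float:
    """Mean reciprocal rank of the target item; 0 if it is outside the top k."""
    total = 0.0
    for ranked, target in zip(ranked_lists, targets):
        top_k: List[str] = list(ranked[:k])
        if target in top_k:
            total += 1.0 / (top_k.index(target) + 1)  # ranks start at 1
    return total / len(targets)


# Toy example: three test cases, each with one ranked list and one target.
ranked = [["a", "b", "c"], ["b", "c", "a"], ["c", "a", "b"]]
targets = ["a", "a", "x"]

recall3 = recall_at_k(ranked, targets, k=3)
mrr3 = mrr_at_k(ranked, targets, k=3)
```

Recall only checks whether the target is in the top-K list, while MRR also rewards placing it near the top, so the two metrics can rank models differently.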

Evaluation


• Baseline methods

POP

Fossil [He et al, ICDM2016]

GRU4Rec [Hidasi et al, ICLR2015]

NARM [Li et al, CIKM2017]

STAMP [Liu et al, KDD2018]

HRNN [Quadrana et al, RecSys2017]

SHAN [Ying et al, IJCAI2018]

BINN [Li et al, KDD2018]

Evaluation

Consider short-term preference only

Consider both long- and short-term preferences


• Overall Performance

Short-term preference is more important on the Tmall dataset.

Long-term preference is more important on the Gowalla dataset.

Evaluation


• Impact of Different User History Lengths

Evaluation


• Analysis on Different Model Components

Remove-LT: No long-term preference

Remove-ST: No short-term preference

Remove-PI: No personalized integration

Evaluation


• Analysis on Hyperparameter λ

Tmall: performance decreases when the value of λ grows.

Gowalla: performance increases when the value of λ grows.

Evaluation


• We design a two-layer nonlocal neural network to precisely capture users' long-term preferences.

• We deploy the GRU network coupled with a nonlocal structure to model short-term preferences.

• We present a personalized strategy to adaptively combine the learned long- and short-term preferences.

Conclusion


Our work on sentiment analysis

Zhuang Chen, Tieyun Qian. Transfer Capsule Network for Aspect Level Sentiment Classification. ACL 2019: 547-556.

Peisong Zhu, Zhuang Chen, Haojie Zheng, Tieyun Qian. Aspect Aware Learning for Aspect Category Sentiment Analysis. ACM Transactions on Knowledge Discovery from Data (TKDD) 13(6), 2019.

Peisong Zhu, Tieyun Qian. Enhanced Aspect Level Sentiment Classification with Auxiliary Memory. COLING 2018: 1077-1088.

Yunkai Yang, Tieyun Qian, Zhuang Chen. Aspect-Level Sentiment Classification with Dependency Rules and Dual Attention. ICONIP (2) 2019: 643-655.


Our work on recommender systems

Ke Sun, Tieyun Qian, Tong Chen, Yile Liang, Quoc Viet Hung Nguyen, Hongzhi Yin. Where to Go Next: Modeling Long- and Short-Term User Preferences for Point-of-Interest Recommendation. AAAI 2020, accepted.

Tieyun Qian, Bei Liu, Quoc Viet Hung Nguyen, Hongzhi Yin. Spatiotemporal Representation Learning for Translation-Based POI Recommendation. ACM Transactions on Information Systems (TOIS) 37(2), Article 18, 2019.

Ke Sun, Tieyun Qian, Hongzhi Yin, Tong Chen, Yiqi Chen, Ling Chen. What Can History Tell Us? Identifying Relevant Sessions for Next-Item Recommendation. CIKM 2019: 1593-1602.

Yile Liang, Tieyun Qian, Huilin Yu. Align Reviews with Topics for Rating Prediction. DASFAA (3) 2019: 249-253.


Source code available at:

https://github.com/NLPWM-WHU


Thank you!

Any questions?