Author: Sinno Jialin Pan, Xiaochuan Ni, Jian-Tao Sun, Qiang Yang, Zheng Chen IW3C2, WWW2010 Presenter: Rei-Zhe Liu, 5/25 Cross-Domain Sentiment Classification via Spectral Feature Alignment

Cross domain sentiment classification via spectral feature alignment

Page 1: Cross domain sentiment classification via spectral feature alignment

Author: Sinno Jialin Pan, Xiaochuan Ni, Jian-Tao Sun, Qiang Yang, Zheng Chen

IW3C2, WWW2010

Presenter: Rei-Zhe Liu, 5/25

Cross-Domain Sentiment Classification via Spectral Feature Alignment

Outline

Introduction

Problem setting

Spectral domain-specific feature alignment

Experiments

Conclusion

Introduction(1/1)

In this paper, we aim to find an effective approach to the cross-domain sentiment classification problem.

We propose a spectral feature alignment algorithm to find a new representation for cross-domain sentiment data.

We construct a bipartite graph to model the co-occurrence relationship between domain-specific words and domain-independent words.

Problem setting(1/3)

Problem setting(2/3)

Problem setting(3/3)

The problem is how to construct an ideal representation such as the one shown in Table 3.

Idea: use domain-independent words as a bridge.

Spectral domain-specific feature alignment

Domain-independent feature selection(1/1)

Our strategy is to select domain-independent features based on their frequency in both domains.

Given the number l of domain-independent features to be selected, we choose features that occur more than k times in both the source and target domains.

k is set to the largest number such that we get at least l such features.
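
The selection rule above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: documents are represented as lists of tokens, "frequency" is taken to be the total token count in each domain, and the function name `select_domain_independent` is hypothetical rather than taken from the paper.

```python
from collections import Counter


def select_domain_independent(source_docs, target_docs, l):
    """Return (k, features): features occurring more than k times in BOTH
    domains, with k the largest threshold that still yields at least l features.

    Assumption: each document is a list of tokens and frequency is the
    total token count per domain.
    """
    src = Counter(w for doc in source_docs for w in doc)
    tgt = Counter(w for doc in target_docs for w in doc)

    # A feature passes threshold k exactly when min(src count, tgt count) > k.
    shared = {w: min(src[w], tgt[w]) for w in src.keys() & tgt.keys()}
    if l < 1 or len(shared) < l:
        raise ValueError("need 1 <= l <= number of features shared by both domains")

    # The l-th largest min-count must exceed k, so the largest valid k
    # is that count minus one.
    counts = sorted(shared.values(), reverse=True)
    k = counts[l - 1] - 1

    selected = sorted(w for w, c in shared.items() if c > k)
    return k, selected
```

Because several features can tie at the threshold, the function may return more than l features, which matches the "at least l" wording on the slide.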

Bipartite feature graph construction(1/3)

We set the window size to be the maximum length of all documents, so two words co-occur whenever they appear in the same document.

We want to show that by constructing a simple bipartite graph and adapting spectral clustering techniques to it, we can relate domain-specific features effectively.
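
As a rough sketch of the graph construction, the edge weights can be stored in a matrix M with one row per domain-specific word and one column per domain-independent word. The code below assumes that, with the window covering the whole document, the weight of an edge is the number of documents in which both words appear; that weighting and the name `cooccurrence_matrix` are assumptions made for illustration.

```python
import numpy as np


def cooccurrence_matrix(docs, ds_words, di_words):
    """Build the (m - l) x l weight matrix of the bipartite feature graph.

    Rows index domain-specific words, columns index domain-independent words.
    Assumption: M[i, j] counts the documents containing both words, since the
    window is as long as the longest document.
    """
    ds_index = {w: i for i, w in enumerate(ds_words)}
    di_index = {w: j for j, w in enumerate(di_words)}
    M = np.zeros((len(ds_words), len(di_words)))

    for doc in docs:
        tokens = set(doc)  # presence per document, not repeated counts
        ds_present = [ds_index[w] for w in tokens if w in ds_index]
        di_present = [di_index[w] for w in tokens if w in di_index]
        for i in ds_present:
            for j in di_present:
                M[i, j] += 1
    return M
```

Documents from the source and target domains would be passed in together, so that domain-specific words from either side link to the shared domain-independent words.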

Bipartite feature graph construction(2/3)

Bipartite feature graph construction(3/3)

Two features tend to be closely related, and will be aligned to the same cluster with high probability, in either of these cases:

Two domain-specific features are connected to many common domain-independent features.

Two domain-independent features are connected to many common domain-specific features.

Spectral feature clustering(1/2)

Given the feature bipartite graph G, our goal is to learn a feature alignment mapping function

φ(·): R^(m−l) → R^k

where m is the number of all features, l is the number of domain-independent features, m−l is the number of domain-specific features, and k is the number of principal components.
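
One way to obtain such a mapping is the standard normalized spectral embedding of the bipartite graph. The sketch below assumes the graph is given as an (m−l) x l weight matrix `M` (rows are domain-specific features, columns are domain-independent features) and that the mapping is linear, defined by the top k eigenvectors of the normalized affinity matrix; the function name and the handling of isolated nodes are illustrative choices, not details taken from the slides.

```python
import numpy as np


def spectral_feature_mapping(M, k):
    """Return an (m - l) x k matrix U_ds so that a domain-specific feature
    vector x (length m - l) is mapped to x @ U_ds (length k).

    Assumption: standard normalized spectral embedding of the bipartite graph.
    """
    M = np.asarray(M, dtype=float)
    n_ds, n_di = M.shape

    # Affinity matrix of the bipartite graph: no edges within either side.
    A = np.block([[np.zeros((n_ds, n_ds)), M],
                  [M.T, np.zeros((n_di, n_di))]])

    # Symmetric normalization D^{-1/2} A D^{-1/2}; isolated nodes get zero rows.
    degree = A.sum(axis=1)
    inv_sqrt = np.zeros(degree.shape)
    nonzero = degree > 0
    inv_sqrt[nonzero] = 1.0 / np.sqrt(degree[nonzero])
    L = inv_sqrt[:, None] * A * inv_sqrt[None, :]

    # Eigenvectors for the k largest eigenvalues of the symmetric matrix L.
    eigvals, eigvecs = np.linalg.eigh(L)
    top = np.argsort(eigvals)[::-1][:k]
    U = eigvecs[:, top]

    # Keep only the rows that belong to domain-specific features.
    return U[:n_ds, :]
```

Eigenvectors are defined only up to sign, so two runs may produce columns that differ by a factor of -1; this does not affect a downstream linear classifier.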

Feature augmentation(1/2)

In practice, we may not be able to identify domain-independent features correctly and thus fail to perform feature alignment perfectly.

We therefore keep the original features and append the new ones; a tradeoff parameter γ is used in this feature augmentation to balance the effect of the original features and the new features.

So, for each data example xi, the new feature representation is defined as

x̃i = [xi, γ φ(xi)], with φ applied to the domain-specific features of xi.
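
A minimal sketch of this augmentation step, assuming the new representation is the original feature vector concatenated with γ times the aligned features, and that the alignment is a linear map given by a matrix `U_ds` of shape (m−l) x k. The names `augment`, `ds_columns` and `U_ds` are hypothetical and introduced only for this example.

```python
import numpy as np


def augment(X, ds_columns, U_ds, gamma):
    """Append gamma-weighted aligned features to the original features.

    X          : (n, m) data matrix, one row per example
    ds_columns : indices of the domain-specific columns of X (assumed known)
    U_ds       : (m - l, k) matrix defining the assumed linear alignment map
    gamma      : tradeoff between original and new features
    """
    X = np.asarray(X, dtype=float)
    aligned = X[:, ds_columns] @ np.asarray(U_ds, dtype=float)
    return np.hstack([X, gamma * aligned])
```

With gamma set to 0 the appended columns are all zero, so a linear classifier behaves as if trained on the original features alone; larger values give the aligned features more weight.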

Experiments

Datasets

The first dataset is from Blitzer et al.

The second dataset is from Amazon, Yelp and Citysearch.

Each review is assigned a sentiment label, +1 or -1.

We construct 12 cross-domain tasks for each dataset (e.g., dvds->kitchen, dvds->books, …).
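
The count of 12 follows from taking every ordered (source, target) pair among four domains: 4 x 3 = 12. The short example below illustrates this for the first dataset, assuming its four domains are books, dvds, electronics and kitchen; the slide only names dvds, kitchen and books, so the fourth name comes from the Blitzer et al. dataset rather than from this deck.

```python
from itertools import permutations

# Assumed domain list for the first dataset (see the note above).
domains = ["books", "dvds", "electronics", "kitchen"]

# Every ordered pair of distinct domains is one source->target task.
tasks = [f"{source}->{target}" for source, target in permutations(domains, 2)]

print(len(tasks))
print(tasks)
```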

Overall comparison results

Conclusion

In our framework, we first build a bipartite graph between domain-independent and domain-specific features.

We propose the spectral feature alignment (SFA) algorithm to align the domain-specific words from the source and target domains into meaningful clusters, with the help of domain-independent words as a bridge.

Our experimental results demonstrate the effectiveness of our proposed framework.