Collective Classification: A brief overview and possible connections to email-acts classification. Vitor R. Carvalho, Text Learning Group Meetings, Carnegie Mellon University, November 10th 2004


Page 1

Collective Classification: A brief overview and possible connections to email-acts classification

Vitor R. Carvalho

Text Learning Group Meetings,

Carnegie Mellon University

November 10th 2004

Page 2

Data Representation

• “Flat” Data
– Object: email msgs
– Attributes: words, sender, etc.
– Class: spam / not spam
– Usually assumed IID

• Sequential Data
– Object: words in text
– Attr: capitalized, number, dict
– Class: POS (or name / not)

• Relational Data
– class + attributes
– + links (relations)
– Example: webpages
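To make the contrast concrete, here is a toy sketch of the three representations; the data, attribute names, and labels below are invented for illustration.

```python
# "Flat" data: each email is an independent (IID) feature vector with a class label.
flat_emails = [
    {"words": {"viagra", "free"}, "sender": "unknown", "label": "spam"},
    {"words": {"meeting", "notes"}, "sender": "alice", "label": "not_spam"},
]

# Sequential data: ordered tokens, each with attributes and a POS / name label.
sequence = [
    {"token": "He", "capitalized": True, "in_dict": True, "label": "pron"},
    {"token": "met", "capitalized": False, "in_dict": True, "label": "verb"},
    {"token": "Carvalho", "capitalized": True, "in_dict": False, "label": "name"},
]

# Relational data: objects carry attributes *and* links to other objects,
# so class labels of linked objects may be correlated (e.g. webpages).
pages = {
    "p1": {"words": {"course", "syllabus"}, "label": "course_page"},
    "p2": {"words": {"homework"}, "label": None},   # unlabeled page
}
links = [("p1", "p2")]                              # hyperlink between the two pages
```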

[Slide diagram: email messages labeled “spam” / “Not spam” and a word sequence tagged pron, name, det, name, verb]

Page 3

[Figure from J. Neville et al., 2003]

Page 4

Relational Data and Collective Classification

• Different objects interact

• Different types of relations (links)

• Attributes may be correlated

• Examples:
– actors, directors, movies, companies
– papers, authors, conferences, citations
– company, employee, customer, …

Classify objects collectively

Use prediction on some objects to improve prediction on related objects

Page 5

Collective Classification Methods

• Relational Probability Trees (RPT)

• Iterative methods (Relaxation-based Methods)

• Relational Dependency Networks (RDN)

• Relational Bayesian Networks (RBN/PRM)

• Relational Markov Networks (RMN)

• Other models (ILP-based, vector-space-based, etc.)

• Overall:
– Lack of direct comparison among methods
– Results are usually compared to a “flat” model
– Splitting data into train/test sets can be an issue

Page 6

Relational Probability Trees

• Decision Trees applied to Relational data

• Predicts the target class label based on:
– same-object attributes
– attributes + links in the “relational neighborhood” (one link away)
– counts of attributes and links in the “neighborhood”

• Enhanced feature selection (Chi-square, pruning, randomization tests)

• Results were not exciting

• Neville et al., KDD-2003; related work from Blockeel et al. (Artificial Intelligence, 1998) and Kramer, AAAI-96
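As a rough sketch of the core RPT move, one might flatten each object's one-link neighborhood into aggregate (count-style) features and learn an ordinary decision tree over them; the toy graph, attribute, and labels below are made up, and the paper's feature-selection machinery (chi-square tests, pruning, randomization tests) is omitted.

```python
from sklearn.tree import DecisionTreeClassifier

# Toy relational data: object attributes, undirected links, and class labels.
attrs = {"a": {"has_cs_words": 1}, "b": {"has_cs_words": 0},
         "c": {"has_cs_words": 1}, "d": {"has_cs_words": 1}}
links = [("a", "b"), ("a", "c"), ("d", "c")]
labels = {"a": 1, "b": 0, "c": 1, "d": 1}

def neighbors(obj):
    return [v for u, v in links if u == obj] + [u for u, v in links if v == obj]

def features(obj):
    nbrs = neighbors(obj)
    return [attrs[obj]["has_cs_words"],                   # same-object attribute
            len(nbrs),                                    # link count (degree)
            sum(attrs[n]["has_cs_words"] for n in nbrs)]  # count of a neighborhood attribute

X = [features(o) for o in attrs]
y = [labels[o] for o in attrs]
tree = DecisionTreeClassifier(max_depth=2).fit(X, y)      # plain decision tree over the flattened features
```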

Page 7

Iterative Methods

• Predicts the target class label based on:
– same-object attributes
– attributes and links of the relational neighborhood
– CLASS LABELS of the neighborhood
– features derived from CLASS LABELS

• Different update strategies:
– by threshold on prediction confidence
– by top-N most confident predictions
– heuristic-based

• Slattery & Mitchell, ICML-2000; Neville & Jensen, AAAI-2000; Chakrabarti et al., ACM-SIGMOD-98

• Some results with Email-acts
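A minimal sketch of the relaxation loop described above, assuming `clf` is a classifier already fitted on local features plus one neighbor-label feature; the graph encoding, the single derived feature, and the top-N commitment rule are illustrative placeholders rather than the exact procedures of the cited papers.

```python
import numpy as np

def iterative_classify(clf, local_feats, neighbors, n_rounds=10, top_n=5):
    """local_feats: {node: np.ndarray}; neighbors: {node: [node, ...]}; binary labels assumed."""
    pred = {v: None for v in local_feats}                   # current class-label guesses

    def feats(v):
        nbr = [pred[u] for u in neighbors[v] if pred[u] is not None]
        frac_pos = float(np.mean(nbr)) if nbr else 0.0      # feature derived from neighbors' CLASS LABELS
        return np.append(local_feats[v], frac_pos)

    for _ in range(n_rounds):
        scored = []
        for v in local_feats:
            proba = clf.predict_proba([feats(v)])[0]
            scored.append((proba.max(), int(proba.argmax()), v))
        # commit only the top-N most confident predictions in this round
        for _, label, v in sorted(scored, reverse=True)[:top_n]:
            pred[v] = label
    return pred
```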

Page 8

Relational Bayesian Networks (RBN/PRM)

• Bayes Net extended to Relational domain

• Given an “instantiation”, it induces a Bayes net that specifies a joint probability distribution over all attributes of all entities

• Directed graphical model, with acyclicity constraint.

• Exact model
– closed form for parameter estimation
– products of conditional probabilities

• Was applied to simple domains, since the acyclicity constraint is very restrictive for most relational applications

• Friedman et al., IJCAI-99; Getoor et al., ICML-2001; Taskar et al., IJCAI-2001
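A toy illustration of the “products of conditional probabilities” point: once the relational skeleton is instantiated, the induced Bayes net scores a complete assignment by multiplying per-attribute conditionals. The two-paper citation example and the CPT numbers below are invented.

```python
# paper2 cites paper1, so paper2's topic depends on paper1's topic (an acyclic dependency).
p_topic1 = {"ML": 0.6, "DB": 0.4}                # P(topic of paper1)
p_topic2_given_topic1 = {                        # P(topic of paper2 | topic of cited paper1)
    "ML": {"ML": 0.8, "DB": 0.2},
    "DB": {"ML": 0.3, "DB": 0.7},
}

def joint(topic1, topic2):
    # Joint probability of one full instantiation = product of the conditionals.
    return p_topic1[topic1] * p_topic2_given_topic1[topic1][topic2]

print(joint("ML", "ML"))                         # 0.6 * 0.8 = 0.48
```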

Page 9

Relational Markov Networks (RMN)

• Extension of CRF idea to Relational Domain

• Given an instantiation, it induces a Markov network that specifies a probability distribution over labels, given links and attributes

• Undirected, Discriminative model

• Parameter estimation is expensive, requires approximate probabilistic inference (belief propagation)

• Taskar et al., UAI-2002
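A tiny sketch of the undirected, discriminative view: a joint labeling is scored by multiplying node potentials (label given attributes) and link potentials, and the normalizer sums over all labelings, which is exactly the quantity that becomes intractable on real graphs and forces approximate inference. The two-page “graph” and potential values below are invented.

```python
from itertools import product

node_potential = {                 # phi(label | page attributes), made-up values
    "p1": {"faculty": 2.0, "student": 1.0},
    "p2": {"faculty": 1.0, "student": 3.0},
}
edge_potential = {                 # psi(label1, label2) over the single link p1 -- p2
    ("faculty", "faculty"): 1.5, ("faculty", "student"): 2.0,
    ("student", "faculty"): 0.5, ("student", "student"): 1.0,
}

def score(y1, y2):                 # unnormalized score of a joint labeling
    return node_potential["p1"][y1] * node_potential["p2"][y2] * edge_potential[(y1, y2)]

labels = ["faculty", "student"]
Z = sum(score(y1, y2) for y1, y2 in product(labels, labels))    # brute-force normalizer (tiny graph only)
prob = {(y1, y2): score(y1, y2) / Z for y1, y2 in product(labels, labels)}
```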

Page 10

Relational Dependency Networks (RDN)

• Dependency Networks extended to Relational domain

• P(X) = ∏_i P(X_i | Neighbors(X_i))

• Given an “instantiation”, it induces a DN that specifies an “approximate” joint probability distribution over all attributes of all objects

• Undirected graphical model, no acyclicity constraint.

• Approximate model
– simple parameter estimation
– approximate inference (Gibbs sampling)

• Neville & Jensen, KDD-MRDM-2003
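A rough sketch of the RDN recipe: learn each conditional P(X_i | Neighbors(X_i)) separately, then recover approximate joint behavior by Gibbs sampling over the objects. The toy link structure and the hand-written conditional below stand in for learned components.

```python
import random

neighbors = {"a": ["b", "c"], "b": ["a"], "c": ["a"]}        # toy link structure

def conditional_prob_pos(v, labels):
    # Placeholder for a learned conditional P(label_v = 1 | neighbor labels).
    nbr = [labels[u] for u in neighbors[v]]
    return 0.2 + 0.6 * (sum(nbr) / len(nbr)) if nbr else 0.5

def gibbs_marginals(n_sweeps=1000, burn_in=100):
    labels = {v: random.randint(0, 1) for v in neighbors}    # random initial labeling
    counts = {v: 0 for v in neighbors}
    for sweep in range(n_sweeps):
        for v in neighbors:                                  # resample each label given its neighborhood
            labels[v] = 1 if random.random() < conditional_prob_pos(v, labels) else 0
        if sweep >= burn_in:
            for v in neighbors:
                counts[v] += labels[v]
    return {v: counts[v] / (n_sweeps - burn_in) for v in neighbors}   # approximate P(label_v = 1)
```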

Page 11

Other Models

[Table of other relational models omitted; from Neville et al., 2003]

Page 12

Comparing Some Results

• Comparing PRM, RMN, SVM and M^3N

• Diff: PRM and RMN
• Diff: mSVM and RMN

• RN* (Relational Neighbor) is a very simple relational classifier

• RN* (Macskassy et al., 2003)
• M^3N (Taskar et al., 2003)

[Bar charts comparing the methods omitted (chart labels: PRM, RMN)]
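For reference, RN* really is simple: ignoring attributes entirely, an unlabeled node takes the (weighted) majority label of its already-labeled neighbors. A sketch over an invented toy graph:

```python
from collections import Counter

links = {"n1": ["n2", "n3"], "n2": ["n1"], "n3": ["n1", "n4"], "n4": ["n3"]}
known = {"n2": "spam", "n3": "spam", "n4": "not_spam"}       # already-labeled nodes

def relational_neighbor(node):
    votes = Counter(known[u] for u in links[node] if u in known)
    return votes.most_common(1)[0][0] if votes else None     # majority label of labeled neighbors

print(relational_neighbor("n1"))                             # -> "spam"
```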

Page 13

End of overview…now, the email-act problem

[Slide diagram: a thread of email messages laid out along a time axis, each labeled with acts such as Request, Commit, Proposal, Delivery, Acknowledge]

• Strong correlation with previous and next message

• Flat data?

• Sequential data?

• A “verb” has little or no correlation with other “verbs” of the same message
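One way to exploit that correlation, in the spirit of the iterative methods above, would be to augment each message's own features with the currently predicted acts of the previous and next messages in the thread and re-classify until the predictions stabilize. A sketch of the feature construction, where the thread structure, act names, and feature encoding are illustrative rather than taken from the talk:

```python
def thread_features(msgs, pred_acts):
    """msgs: list of {'bow': {word: count}}; pred_acts: current act guesses, one per message."""
    rows = []
    for i, msg in enumerate(msgs):
        feats = dict(msg["bow"])                                           # the message's own word features
        feats["prev_act=" + (pred_acts[i - 1] if i > 0 else "START")] = 1  # act of previous message
        feats["next_act=" + (pred_acts[i + 1] if i + 1 < len(msgs) else "END")] = 1  # act of next message
        rows.append(feats)
    return rows

msgs = [{"bow": {"could": 1, "you": 1, "send": 1}}, {"bow": {"i": 1, "will": 1}}]
print(thread_features(msgs, ["Request", "Commit"]))
```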