Upload
kareem
View
40
Download
0
Tags:
Embed Size (px)
DESCRIPTION
Coupled Bayesian Sets Algorithm for Semi-supervised Learning and Information Extraction. Saurabh Verma IIT (BHU), Varanasi Estevam R. Hruschka Jr. F ederal University of São Carlos, Brazil. http:// rtw.ml.cmu.edu. NELL: Never-Ending Language Learner. Inputs: initial ontology - PowerPoint PPT Presentation
Citation preview
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Coupled Bayesian Sets Algorithm for Semi-supervised Learning and
Information Extraction
Saurabh VermaIIT (BHU), Varanasi
Estevam R. Hruschka Jr.Federal University of São Carlos, Brazil
ECML/PKDD2012 Bristol, UK September, 26th, 2012
http://rtw.ml.cmu.edu
ECML/PKDD2012 Bristol, UK September, 26th, 2012
NELL: Never-Ending Language Learner
Inputs: initial ontology handful of examples of each predicate in ontology the web occasional interaction with human trainers
The task: run 24x7, forever• each day:
1. extract more facts from the web to populate the initial ontology
2. learn to read (perform #1) better than yesterday
ECML/PKDD2012 Bristol, UK September, 26th, 2012
NELL: Never-Ending Language Learner
Goal:• run 24x7, forever• each day:
1. extract more facts from the web to populate given ontology
2. learn to read better than yesterday
Today...Running 24 x 7, since January, 2010Input: • ontology defining ~800 categories and relations • 10-20 seed examples of each • 1 billion web pages (ClueWeb – Jamie Callan)
Result: • continuously growing KB with +1,300,000 extracted beliefs
ECML/PKDD2012 Bristol, UK September, 26th, 2012
NELL: Never-Ending Language Learner
http://rtw.ml.cmu.edu
ECML/PKDD2012 Bristol, UK September, 26th, 2012
The Problem with Semi-Supervised Bootstrap Learning
ParisPittsburghSeattleCupertino
ECML/PKDD2012 Bristol, UK September, 26th, 2012
The Problem with Semi-Supervised Bootstrap Learning
ParisPittsburghSeattleCupertino
… mayor of arg1… live in arg1
ECML/PKDD2012 Bristol, UK September, 26th, 2012
The Problem with Semi-Supervised Bootstrap Learning
ParisPittsburghSeattleCupertino
… mayor of arg1… live in arg1
San FranciscoAustindenial
ECML/PKDD2012 Bristol, UK September, 26th, 2012
The Problem with Semi-Supervised Bootstrap Learning
ParisPittsburghSeattleCupertino
… mayor of arg1… live in arg1
San FranciscoAustindenial
ECML/PKDD2012 Bristol, UK September, 26th, 2012
The Problem with Semi-Supervised Bootstrap Learning
ParisPittsburghSeattleCupertino
… mayor of arg1… live in arg1
San FranciscoAustindenial
arg1 is home of …… traits such as arg1
ECML/PKDD2012 Bristol, UK September, 26th, 2012
The Problem with Semi-Supervised Bootstrap Learning
ParisPittsburghSeattleCupertino
… mayor of arg1… live in arg1
San FranciscoAustindenial
arg1 is home of …… traits such as arg1
ECML/PKDD2012 Bristol, UK September, 26th, 2012
The Problem with Semi-Supervised Bootstrap Learning
ParisPittsburghSeattleCupertino
… mayor of arg1… live in arg1
San FranciscoAustindenial
arg1 is home of …… traits such as arg1
ECML/PKDD2012 Bristol, UK September, 26th, 2012
The Problem with Semi-Supervised Bootstrap Learning
ParisPittsburghSeattleCupertino
… mayor of arg1… live in arg1
San FranciscoAustindenial
arg1 is home of …… traits such as arg1
…
it’s underconstrained!!
Bayesian Sets (BS)
Given and , rank the elements of by how well they would “fit into” a set which includes
Define a score for each :
From Bayes rule, the score can be re-written as:
}{xD DDc D
cD
)()(
)(x
xx
pDp
score c
Dx
)()(),()(c
c
DppDpscore
xxx
Ghahramani & Heller; NIPS 2005
Bayesian Sets (BS)
Intuitively, the score compares the probability that x and Dc were generated by the same model with the same unknown parameters θ, to the probability that x and Dc came from models with different parameters θ and θ’.
Ghahramani & Heller; NIPS 2005
)()(),()(c
c
DppDpscore
xxx
Bayesian Sets (BS)
Intuitively, the score compares the probability that x and Dc were generated by the same model with the same unknown parameters θ, to the probability that x and Dc came from models with different parameters θ and θ’.
Ghahramani & Heller; NIPS 2005
)()(),()(c
c
DppDpscore
xxx
Bayesian Sets (BS)
Intuitively, the score compares the probability that x and Dc were generated by the same model with the same unknown parameters θ, to the probability that x and Dc came from models with different parameters θ and θ’.
Ghahramani & Heller; NIPS 2005
)()(),()(c
c
DppDpscore
xxx
Bayesian Sets (BS)
Intuitively, the score compares the probability that x and Dc were generated by the same model with the same unknown parameters θ, to the probability that x and Dc came from models with different parameters θ and θ’.
Ghahramani & Heller; NIPS 2005
)()(),()(c
c
DppDpscore
xxx
ECML/PKDD2012 Bristol, UK September, 26th, 2012
BS using NELL’s Ontology
Initial ontology: Everything
PersonCompany VegetableSport
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Initial ontology: Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahoo
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
BS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus, run BS onceEverything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahoo
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
BS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus, run BS onceEverything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
BoeingBrazil Telecom
TexacoFacebook
DELL…
Peter FlachBill ClintonJeremy Lin
AdeleBarak ObamaDalai Lama
FreudTom Mitchell
AristotleAlan Turing
Alexander Fleming…
BasketballFootball
SwimmingTennisGolf
SoccerVolleyballJogging
MarathonBaseball
Badminton…
BristolPittsburgh
Rio de JaneiroTokyo
Cape TownNew YorkLondon
Sao PauloBrisbaneBeijingCairo
…
BS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
BS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
BS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
BS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus, iteratively run BSEverything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
Boeing
Peter FlachBill ClintonJeremy Lin
AdeleBarak ObamaDalai Lama
Freud
BasketballFootball
SwimmingTennisGolf
SoccerVolleyball
BristolPittsburgh
Rio de JaneiroTokyo
Cape TownNew YorkLondon
Iterative BS using NELL’s OntologyZhang & Liu, 2011
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus, iteratively run BSEverything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
BoeingBrazil Telecom
Texaco
Peter FlachBill ClintonJeremy Lin
AdeleBarak ObamaDalai Lama
FreudTom Mitchell
Aristotle
BasketballFootball
SwimmingTennisGolf
SoccerVolleyballJogging
Marathon
BristolPittsburgh
Rio de JaneiroTokyo
Cape TownNew YorkLondon
Sao PauloBrisbane
Iterative BS using NELL’s OntologyZhang & Liu, 2011
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus, iteratively run BSEverything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
BoeingBrazil Telecom
TexacoFacebook
DELL…
Peter FlachBill ClintonJeremy Lin
AdeleBarak ObamaDalai Lama
FreudTom Mitchell
AristotleAlan Turing
Alexander Fleming…
BasketballFootball
SwimmingTennisGolf
SoccerVolleyballJogging
MarathonBaseball
Badminton…
BristolPittsburgh
Rio de JaneiroTokyo
Cape TownNew YorkLondon
Sao PauloBrisbaneBeijingCairo
…
Iterative BS using NELL’s OntologyZhang & Liu, 2011
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Iterative BS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Iterative BS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
NELL: Coupled semi-supervised training of many functions
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Coupled Training Type 2:Structured Outputs, Multitask, Posterior
Regularization, Multilabel
Learn functions with the same input, different outputs, where we know some constraint
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Coupled Training Type 2:Structured Outputs, Multitask, Posterior
Regularization, Multilabel
Learn functions with the same input, different outputs, where we know some constraint
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Coupled Training Type 2:Structured Outputs, Multitask, Posterior
Regularization, Multilabel
Learn functions with the same input, different outputs, where we know some constraint
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Coupled Bayesian Sets (CBS)
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Coupled Bayesian Sets (CBS)
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Coupled Bayesian Sets (CBS)
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Coupled Bayesian Sets (CBS)
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Coupled Bayesian Sets (CBS)
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Coupled Bayesian Sets (CBS)
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahoo
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahoo
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
MutuallyExclusive(Company,Person);MutuallyExclusive(Company,Sport);MutuallyExclusive(Company,City);MutuallyExclusive(Pearson,Sport);…
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahoo
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
MutuallyExclusive(Company,Person);MutuallyExclusive(Company,Sport);MutuallyExclusive(Company,City);MutuallyExclusive(Pearson,Sport);…
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahoo
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
MutuallyExclusive(Company,Person);MutuallyExclusive(Company,Sport);MutuallyExclusive(Company,City);MutuallyExclusive(Pearson,Sport);…
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahoo
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
MutuallyExclusive(Company,Person);MutuallyExclusive(Company,Sport);MutuallyExclusive(Company,City);MutuallyExclusive(Pearson,Sport);…
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahoo
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
MutuallyExclusive(Company,Person);MutuallyExclusive(Company,Sport);MutuallyExclusive(Company,City);MutuallyExclusive(Pearson,Sport);…
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahoo
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
MutuallyExclusive(Company,Person);MutuallyExclusive(Company,Sport);MutuallyExclusive(Company,City);MutuallyExclusive(Pearson,Sport);…
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahoo
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
MutuallyExclusive(Company,Person);MutuallyExclusive(Company,Sport);MutuallyExclusive(Company,City);MutuallyExclusive(Pearson,Sport);…
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahoo
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
MutuallyExclusive(Company,Person);MutuallyExclusive(Company,Sport);MutuallyExclusive(Company,City);MutuallyExclusive(Pearson,Sport);…
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
Boeing
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
Boeing
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
Boeing
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
…AT&T
Boeing
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
Boeing
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
…AT&T
Boeing
BasketballFootball
SwimmingTennisGolf…
AT&TBoeing
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
Boeing
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
…AT&T
Boeing
BasketballFootball
SwimmingTennisGolf…
AT&TBoeing
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town…
AT&TBoeing
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
Boeing
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
BoeingBrazil Telecom
Texaco
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
BoeingBrazil Telecom
Texaco
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
BoeingBrazil Telecom
Texaco
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
…Brazil Telecom
Texaco
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
BoeingBrazil Telecom
Texaco
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
…Brazil Telecom
Texaco
BasketballFootball
SwimmingTennisGolf…
Brazil TelecomTexaco
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
BoeingBrazil Telecom
Texaco
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
…Brazil Telecom
Texaco
BasketballFootball
SwimmingTennisGolf…
Brazil TelecomTexaco
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town…
Brazil TelecomTexaco
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
Given a huge web corpus and mutually exclusiveness constraints, iteratively run BS
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
BoeingBrazil Telecom
Texaco
Peter FlachBill ClintonJeremy Lin
AdeleBarak Obama
BasketballFootball
SwimmingTennisGolf
BristolPittsburgh
Rio de JaneiroTokyo
Cape Town
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s OntologyWhat if we do not have the mutual exclusiveness constraints?
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
BoeingBrazil Telecom
Texaco…
Great BritainKeyboard
Pencil
Peter FlachBill ClintonJeremy Lin
AdeleBarak ObamaDalai Lama
FreudTom Mitchell
Aristotle…
BasketballFootball
SwimmingTennisGolf
SoccerVolleyballJogging
Marathon…
BristolPittsburgh
Rio de JaneiroTokyo
Cape TownNew YorkLondon
Sao PauloBrisbane
…
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s OntologyWhat if we do not have the mutual exclusiveness constraints?
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
BoeingBrazil Telecom
Texaco…
Great BritainKeyboard
Pencil
Peter FlachBill ClintonJeremy Lin
AdeleBarak ObamaDalai Lama
FreudTom Mitchell
Aristotle…
BasketballFootball
SwimmingTennisGolf
SoccerVolleyballJogging
Marathon…
BristolPittsburgh
Rio de JaneiroTokyo
Cape TownNew YorkLondon
Sao PauloBrisbane
…
Not Company
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s OntologyWhat if we do not have the mutual exclusiveness constraints?
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
BoeingBrazil Telecom
Texaco…
Great BritainKeyboard
Pencil
Peter FlachBill ClintonJeremy Lin
AdeleBarak ObamaDalai Lama
FreudTom Mitchell
Aristotle…
BasketballFootball
SwimmingTennisGolf
SoccerVolleyballJogging
Marathon…
BristolPittsburgh
Rio de JaneiroTokyo
Cape TownNew YorkLondon
Sao PauloBrisbane
…
Not Company
Great BritainKeyboard
Pencil
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s OntologyWhat if we do not have the mutual exclusiveness constraints?
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
BoeingBrazil Telecom
Texaco…
Great BritainKeyboard
Pencil
Peter FlachBill ClintonJeremy Lin
AdeleBarak ObamaDalai Lama
FreudTom Mitchell
Aristotle…
BasketballFootball
SwimmingTennisGolf
SoccerVolleyballJogging
Marathon…
BristolPittsburgh
Rio de JaneiroTokyo
Cape TownNew YorkLondon
Sao PauloBrisbane
…
Not Company
Great BritainKeyboard
Pencil
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s OntologyWhat if we do not have the mutual exclusiveness constraints?
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
BoeingBrazil Telecom
Texaco…
Great BritainKeyboard
Pencil
Peter FlachBill ClintonJeremy Lin
AdeleBarak ObamaDalai Lama
FreudTom Mitchell
Aristotle…
BasketballFootball
SwimmingTennisGolf
SoccerVolleyballJogging
Marathon…
BristolPittsburgh
Rio de JaneiroTokyo
Cape TownNew YorkLondon
Sao PauloBrisbane
…
Not Company
Great BritainKeyboard
Pencil
MutuallyExclusive(Company,NotCompany);
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s OntologyWhat if we do not have the mutual exclusiveness constraints?
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
BoeingBrazil Telecom
Texaco…
Great BritainKeyboard
Pencil
Peter FlachBill ClintonJeremy Lin
AdeleBarak ObamaDalai Lama
FreudTom Mitchell
Aristotle…
BasketballFootball
SwimmingTennisGolf
SoccerVolleyballJogging
Marathon…
BristolPittsburgh
Rio de JaneiroTokyo
Cape TownNew YorkLondon
Sao PauloBrisbane
…
Not Company
Great BritainKeyboard
Pencil
MutuallyExclusive(Company,NotCompany);
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s OntologyWhat if we do not have the mutual exclusiveness constraints?
Everything
PersonCompany CitySport
AppleMicrosoftGoogle
IBMYahooAT&T
BoeingBrazil Telecom
Texaco…
Great BritainKeyboard
Pencil
Peter FlachBill ClintonJeremy Lin
AdeleBarak ObamaDalai Lama
FreudTom Mitchell
Aristotle…
BasketballFootball
SwimmingTennisGolf
SoccerVolleyballJogging
Marathon…
BristolPittsburgh
Rio de JaneiroTokyo
Cape TownNew YorkLondon
Sao PauloBrisbane
…
Not Company
Great BritainKeyboard
Pencil
MutuallyExclusive(Company,NotCompany);
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
What if we do not have the mutual exclusiveness constraints?
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
What if we do not have the mutual exclusiveness constraints?
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
What if we do not have the mutual exclusiveness constraints?
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
What if we do not have the mutual exclusiveness constraints?
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
What if we do not have the mutual exclusiveness constraints?
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
What if we do not have the mutual exclusiveness constraints?
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
What if we do not have the mutual exclusiveness constraints?
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
What about Semantic Relations?
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
What about Semantic Relations?
ECML/PKDD2012 Bristol, UK September, 26th, 2012
CBS using NELL’s Ontology
What about Semantic Relations?
ECML/PKDD2012 Bristol, UK September, 26th, 2012
ConclusionsCoupled Bayesian Sets
• semi-supervised learning approach to extract category instances (e.g. country(USA), city(New York) from web pages;
• based on the original Bayesian Sets • can outperform algorithms such as the original Bayesian
Set, the Naive Bayes classifier, the Bas-all and the coupled semi-supervised logistic regression algorithm (CPL);
• can be used to automatically generate new constraints to the set expansion task even when no mutually exclusiveness relationship is previously defined
ECML/PKDD2012 Bristol, UK September, 26th, 2012
AcknowledgementsThanks to:
ECML/PKDD2012 audience! Also Thanks to:
• Department of Science and Technology, Government of India under Indo-Brazil cooperation programme;
• Brazilian research agencies CAPES and CNPq;
• Zoubin Ghahramani and K.A. Heller;
• All the Read The Web group;
contact: [email protected]://rtw.ml.cmu.edu