The Future of Knowledge Graphs Moving beyond Simple ...qngw2014.bj.bcebos.com/upload/kg3/KG 2015 -...

Preview:

Citation preview

The Future of Knowledge Graphs:Moving beyond

Simple Collections of Facts

Gerard de MeloAssistant Professor, Tsinghua University

http://gerard.demelo.org

The Future of Knowledge Graphs:Moving beyond

Simple Collections of Facts

Gerard de MeloAssistant Professor, Tsinghua University

http://gerard.demelo.org

25 Years of the World Wide Web:1989−2014

25 Years of the World Wide Web:1989−2014

http://geekcom.wordpress.com/2009/03/19/

Tim Berners-Lee

25 Years of the World Wide Web:1989−2014

25 Years of the World Wide Web:1989−2014

HyperText(the “HT” in

“HTML”)

HyperText(the “HT” in

“HTML”)

Basic Idea:Connecting Data

Basic Idea:Connecting Data

http://geekcom.wordpress.com/2009/03/19/

Tim Berners-Lee

25 Years of the World Wide Web:1989−2014

25 Years of the World Wide Web:1989−2014

http://geekcom.wordpress.com/2009/03/19/

Tim Berners-Lee

25 Years of the World Wide Web:1989−2014

25 Years of the World Wide Web:1989−2014

http://geekcom.wordpress.com/2009/03/19/

Tim Berners-Lee

The Semantic WebThe Semantic Web

http://geekcom.wordpress.com/2009/03/19/

Tim Berners-Lee

col-league

born in Frankfurt

describedby

created by

A Web ofmachine-readableentity-relationshipdata

A Web ofmachine-readableentity-relationshipdata

createdby

Frankfurt image: https://en.wikipedia.org/wiki/File:Frankfurt_Am_Main-Stadtansicht_von_der_Deutschherrnbruecke_am_fruehen_Abend-20110808.jpg

The Semantic WebThe Semantic Web

http://geekcom.wordpress.com/2009/03/19/

Tim Berners-Lee

A Web ofmachine-readableentity-relationshipdata

A Web ofmachine-readableentity-relationshipdata

Frankfurt image: https://en.wikipedia.org/wiki/File:Frankfurt_Am_Main-Stadtansicht_von_der_Deutschherrnbruecke_am_fruehen_Abend-20110808.jpg

Problem:Just an academicidea, not used inpractice!

Problem:Just an academicidea, not used inpractice!

Big Knowledge CollectionsBig Knowledge Collections

Big Knowledge CollectionsBig Knowledge CollectionsBig Knowledge CollectionsBig Knowledge Collections

https://commons.wikimedia.org/wiki/File:Encyclopedia_Britannica_in_the_library_of_The_Kings_School,_Goa.jpg

Big Knowledge CollectionsBig Knowledge CollectionsBig Knowledge CollectionsBig Knowledge Collections

Rob Matthews: printed small sample of Wikipedia

Actually, a printedWikipedia corresponds to2000 Britannica volumes

Source:http://www.labnol.org/internet/wikipedia-printed-book/9136/

Actually, a printedWikipedia corresponds to2000 Britannica volumes

Source:http://www.labnol.org/internet/wikipedia-printed-book/9136/

Background: 2006/7Background: 2006/7

Image: weinbrenner.single.arabzadeh. Architektenwerkgemeinschafthttp://www.wsa-nt.de

Computer Science branchof Max Planck Society, whichhas had 33 Nobel Prize winners

Computer Science branchof Max Planck Society, whichhas had 33 Nobel Prize winners

Big Knowledge CollectionsBig Knowledge Collections

YAGO2. Hoffart et al. WWW 2011.

YAGO2. Hoffart et al. WWW 2011.

Gerard de Melo

YAGO 1: 2007 YAGO 1: 2007

Big Knowledge GraphsBig Knowledge Graphs

Big Knowledge GraphsBig Knowledge Graphs

Gerard de Melo

Google Knowledge GraphGoogle Knowledge GraphGoogle Knowledge GraphGoogle Knowledge Graph

Gerard de Melo

Mainstream AdoptionMainstream AdoptionMainstream AdoptionMainstream Adoption

Mainstream AdoptionMainstream AdoptionMainstream AdoptionMainstream Adoption

Mainstream AdoptionMainstream AdoptionMainstream AdoptionMainstream Adoption

Mainstream AdoptionMainstream AdoptionMainstream AdoptionMainstream Adoption

Solved:Solved:The Low-Hanging FruitThe Low-Hanging Fruit

Solved:Solved:The Low-Hanging FruitThe Low-Hanging Fruit

What we should be able to askWhat we should be able to askWhat we should be able to askWhat we should be able to ask

Europe: Could be UK, France, Germany, etc.!Europe: Could be UK, France, Germany, etc.!

What we should be able to askWhat we should be able to askWhat we should be able to askWhat we should be able to ask

Spring: March, April, etc.Spring: March, April, etc.

What we should be able to askWhat we should be able to askWhat we should be able to askWhat we should be able to ask

What we should be able to askWhat we should be able to askWhat we should be able to askWhat we should be able to ask

Are any talks at KG 2015about social media?

The following talks at KG 2015seem related to social media:

...Image: Microsoft Cortana

What we should be able to askWhat we should be able to askWhat we should be able to askWhat we should be able to ask

Which talks at KG 2015 are mostrelated to my current projects?

The following talks at KG 2015seem most related to your work:

...Image: Microsoft Cortana

Beyond Collections of FactsBeyond Collections of Facts

1. Knowledge Integration and Organization

2. Natural Language Coupling

3. Multimodal and Cognitive Knowledge

Beyond Collections of FactsBeyond Collections of Facts

1. Knowledge Integration and Organization

2. Natural Language Coupling

3. Multimodal and Cognitive Knowledge

The Web of DataThe Web of Data

LinkedData

The Web of DataThe Web of Data

LinkedIntegrated

Data?

Linked Data Is Still NotIntegrated Data

Linked Data Is Still NotIntegrated Data

Technical Challenges

● Different access mechanisms:Individual HTTP requests, SPARQL endpoints,Downloads

● Downloads often not available

● Servers offline

Technical Challenges

● Different access mechanisms:Individual HTTP requests, SPARQL endpoints,Downloads

● Downloads often not available

● Servers offline

Linked Data Is Still NotIntegrated Data

Linked Data Is Still NotIntegrated Data

Linked Data Is Still NotIntegrated Data

Linked Data Is Still NotIntegrated Data

https://commons.wikimedia.org/wiki/File:Gov%27t_Shutdown,_Locked.._-_Flickr_-_daveynin.jpg

A lot of dataA lot of datais locked awayis locked awayto some extentto some extent

A lot of dataA lot of datais locked awayis locked awayto some extentto some extent

Linked Data Is Still NotIntegrated Data

Linked Data Is Still NotIntegrated Data

Semantic Challenges

● Different ontologies/schemas/vocabularies

● Links are incomplete

● Links are bad

● Data is sometimes bad or outdated

Semantic Challenges

● Different ontologies/schemas/vocabularies

● Links are incomplete

● Links are bad

● Data is sometimes bad or outdated

Disconnected InformationDisconnected Information

IsA ΦPhilosopher

RudolfCarnap

Studied at Universityof

BerlinCarnap

Was Professor at

Einstein

bornInAlbert Einstein HU Berlin

Ulm

...

Current SituationCurrent SituationCurrent SituationCurrent Situation

Alexandre Duret-Lutz https://www.flickr.com/photos/gadl/110845690/

Goal: Integrated andOrganized KnowledgeGoal: Integrated and

Organized Knowledge

Theological Hall, Strahov Monastery Library, Prague

EntityIntegration

EntityIntegration

Entity IntegrationEntity Integration

Source: Bordes & Gabrilovich. KDD 2014 Tutorial

Source: Peter Mika

Entity Integration:Challenges

Entity Integration:Challenges

Entity-Level Integration

Gerard de Melo

Language-specific,Language-specific,Domain-specific,Domain-specific,Arbitrary DatabasesArbitrary Databases

Language-specific,Language-specific,Domain-specific,Domain-specific,Arbitrary DatabasesArbitrary Databases

Entity Integration:Computing Similarity Scores

Entity Integration:Computing Similarity Scores

Are these two the same? Are these two the same?

Compare attributes, relations, textCompare attributes, relations, text→ Similarity score (weighted link)→ Similarity score (weighted link)

Are these two the same? Are these two the same?

Compare attributes, relations, textCompare attributes, relations, text→ Similarity score (weighted link)→ Similarity score (weighted link)

ACL 2010AAAI 2013ACL 2010AAAI 2013

UseIdentity Linksto connectWhat isequivalent

Merging Structured DataMerging Structured DataMerging Structured DataMerging Structured Data

Merging Structured DataMerging Structured DataMerging Structured DataMerging Structured Data

ACL 2010AAAI 2013ACL 2010AAAI 2013

Merging Structured DataMerging Structured Data

Trentino Trentino-Alto Adige

Merging Structured DataMerging Structured DataMerging Structured DataMerging Structured Data

One bad link is One bad link is enough to make aenough to make aconnected component connected component inconsistentinconsistent

One bad link is One bad link is enough to make aenough to make aconnected component connected component inconsistentinconsistent

ACL 2010AAAI 2013ACL 2010AAAI 2013

OptimizationOptimizationProblem:Problem:

Make clustersMake clustersconsistentconsistent

OptimizationOptimizationProblem:Problem:

Make clustersMake clustersconsistentconsistent

Either by deletingEither by deletingedges or mergingedges or merging

conceptsconcepts

Either by deletingEither by deletingedges or mergingedges or merging

conceptsconcepts

Merging Structured DataMerging Structured DataMerging Structured DataMerging Structured Data

ACL 2010AAAI 2013ACL 2010AAAI 2013

Min. cost solution:Min. cost solution:NP-hardNP-hard

APX-hardAPX-hard

Min. cost solution:Min. cost solution:NP-hardNP-hard

APX-hardAPX-hard

Merging Structured DataMerging Structured DataMerging Structured DataMerging Structured Data

ACL 2010AAAI 2013ACL 2010AAAI 2013

Finally, use region growingFinally, use region growingalgorithm in the spiritalgorithm in the spirit

of Leighton & Rao 1988of Leighton & Rao 1988

Finally, use region growingFinally, use region growingalgorithm in the spiritalgorithm in the spirit

of Leighton & Rao 1988of Leighton & Rao 1988

Linear Program RelaxationLinear Program RelaxationLinear Program RelaxationLinear Program Relaxation

Approximation Guarantee:Approximation Guarantee:4ln(nq+1) 4ln(nq+1)

for n distinctness assertions,for n distinctness assertions,q=max |Dq=max |D

i,ji,j||

but independent of |Dbut independent of |Dii| !| !

Approximation Guarantee:Approximation Guarantee:4ln(nq+1) 4ln(nq+1)

for n distinctness assertions,for n distinctness assertions,q=max |Dq=max |D

i,ji,j||

but independent of |Dbut independent of |Dii| !| !

Merging Structured DataMerging Structured DataMerging Structured DataMerging Structured Data

Linear Program RelaxationLinear Program RelaxationLinear Program RelaxationLinear Program Relaxation

Nice:Nice:

This generalizes theThis generalizes theHungarian AlgorithmHungarian Algorithm

from bipartite matchingsfrom bipartite matchingsto many non-standard kindsto many non-standard kinds

of matchingsof matchings

(cf. de Melo. AAAI 2013)(cf. de Melo. AAAI 2013)

Nice:Nice:

This generalizes theThis generalizes theHungarian AlgorithmHungarian Algorithm

from bipartite matchingsfrom bipartite matchingsto many non-standard kindsto many non-standard kinds

of matchingsof matchings

(cf. de Melo. AAAI 2013)(cf. de Melo. AAAI 2013)

Merging Structured DataMerging Structured DataMerging Structured DataMerging Structured Data

TaxonomicIntegrationTaxonomicIntegration

Question Answering

IBM's Jeopardy!-winning Watson system

Gerard de Melo

Question Answering

IBM's Jeopardy!-winning Watson system

Gerard de Melo

Question Answering

IBM's Jeopardy!-winning Watson system

Gerard de Melo

De Melo & Weikum (2010).CIKM Best Interdisciplinary Paper AwardDe Melo & Weikum (2010).CIKM Best Interdisciplinary Paper Award

Taxonomic Integration:Taxonomic Integration:MENTA ApproachMENTA Approach

Taxonomic Integration:Taxonomic Integration:MENTA ApproachMENTA Approach

De Melo & Weikum (2010).CIKM Best Interdisciplinary Paper AwardDe Melo & Weikum (2010).CIKM Best Interdisciplinary Paper Award

Taxonomic Integration:Taxonomic Integration:MENTA ApproachMENTA Approach

Taxonomic Integration:Taxonomic Integration:MENTA ApproachMENTA Approach

De Melo & Weikum (2010).CIKM Best Interdisciplinary Paper AwardDe Melo & Weikum (2010).CIKM Best Interdisciplinary Paper Award

Taxonomic Integration:Taxonomic Integration:MENTA ApproachMENTA Approach

Taxonomic Integration:Taxonomic Integration:MENTA ApproachMENTA Approach

De Melo & Weikum (2010).CIKM Best Interdisciplinary Paper AwardDe Melo & Weikum (2010).CIKM Best Interdisciplinary Paper Award

Taxonomic Integration:Taxonomic Integration:MENTA ApproachMENTA Approach

Taxonomic Integration:Taxonomic Integration:MENTA ApproachMENTA Approach

Taxonomic Integration:Taxonomic Integration:MENTAMENTA

Taxonomic Integration:Taxonomic Integration:MENTAMENTA

Taxonomic Integration:Taxonomic Integration:MENTAMENTA

Taxonomic Integration:Taxonomic Integration:MENTAMENTA

Image: https://de.wikipedia.org/wiki/Datei:Bersntol_palae.jpg

Fersental(Bersntol, Valle dei Mòcheni)Fersental(Bersntol, Valle dei Mòcheni)

Taxonomic Integration:Taxonomic Integration:MENTAMENTA

Taxonomic Integration:Taxonomic Integration:MENTAMENTA

Taxonomic Integration:Taxonomic Integration:MENTAMENTA

Taxonomic Integration:Taxonomic Integration:MENTAMENTA

https://de.wikipedia.org/wiki/Datei:Language_distribution_Trentino_2011.png

Fersental(Bersntol, Valle dei Mòcheni)Fersental(Bersntol, Valle dei Mòcheni)

Taxonomic Integration:Taxonomic Integration:MENTAMENTA

Taxonomic Integration:Taxonomic Integration:MENTAMENTA

Use Identity ConstraintAlgorithm to form equivalence classes

Use Identity ConstraintAlgorithm to form equivalence classes

Markov Chain RandomWalk with Restartsto Rank Parents

Markov Chain RandomWalk with Restartsto Rank Parents

Taxonomic Integration:Taxonomic Integration:MENTAMENTA

Taxonomic Integration:Taxonomic Integration:MENTAMENTA

Taxonomic Integration:Taxonomic Integration:MENTAMENTA

Taxonomic Integration:Taxonomic Integration:MENTAMENTA

Taxonomic Integration:Taxonomic Integration:MENTAMENTA

Taxonomic Integration:Taxonomic Integration:MENTAMENTA

KnowledgeIntegrationKnowledgeIntegration

Knowledge IntegrationKnowledge Integration

X isAuthorOf YY writtenBy XX wrote YY writtenInYear Z

UWN/MENTA

multilingual extension of WordNet forword senses and taxonomical information over 200 languages

Gerard de Melo

Search Interfaces

“Which companies were created during the last century in Silicon Valley ?”

YAGO2:WWW 2011

Best Demo Award

YAGO2:WWW 2011

Best Demo Award

Gerard de Melo

Example:Example:Lexvo.orgLexvo.orgExample:Example:Lexvo.orgLexvo.org

Semantic WebSemantic WebJournal 2014Journal 2014Semantic WebSemantic WebJournal 2014Journal 2014

InterdisciplinaryInterdisciplinaryWork, e.g. inWork, e.g. inDigital HumanitiesDigital Humanities

InterdisciplinaryInterdisciplinaryWork, e.g. inWork, e.g. inDigital HumanitiesDigital Humanities

Lexvo.orgLexvo.org

YAGO-SUMO: KnowledgeIntegration for ReasoningYAGO-SUMO: KnowledgeIntegration for Reasoning

SUMO

70000 axiomsopen source

SUMO

70000 axiomsopen source

YAGO-SUMO

SUMO's axiomsattached to

real-world instances,

overall many millions of axioms

YAGO-SUMO

SUMO's axiomsattached to

real-world instances,

overall many millions of axioms

Smarter Question AnsweringSmarter Question Answering

Sutcliffe et al.Sutcliffe et al.Sutcliffe et al.Sutcliffe et al.

Gerard de Melo

Beyond Collections of FactsBeyond Collections of Facts

1. Knowledge Integration and Organization

2. Natural Language Coupling

3. Multimodal and Cognitive Knowledge

What is the capital of New Jersey?

??

??

Question AnsweringQuestion Answering

Buy Jaguar in Beijing

Question AnsweringQuestion Answering

Which meaning?Which meaning?

JaguarJaguar

Buy Jaguar in Beijing

Question AnsweringQuestion Answering

“Do Macs now supportWindows?”

“Do Macs now supportWindows?”

Question AnsweringQuestion Answering

Lexical Knowledge Bases

Multilingual Lexical Knowledge

UWN (de Melo & Weikum 2009)

MENTA

UWN: Universal Wordnet

Gerard de Melo

Exploit millionsof translationsfrom the Web

Exploit millionsof translationsfrom the Web

UWN: “Universal Wordnet”

Learn regularized regression model

to predict edge weightsfor a given pair of nodes

Learn regularized regression model

to predict edge weightsfor a given pair of nodes

Graph-based features, weighted by various edge weight metrics

Graph-based features, weighted by various edge weight metrics

Gerard de Melo

CIKM 2009CIKM 2009

UWN: Universal Wordnet

over 1,000,000 words in over 100 languages

CIKM 2009CIKM 2009 ICGL 2008ICGL 2008Best Paper AwardBest Paper AwardICGL 2008ICGL 2008Best Paper AwardBest Paper Award

Gerard de Melo

UWN: Getting StartedUWN: Getting Started

Simple API for JVM Languages

val uwn = new UWN(new File("plugins/"))for (m <- uwn.getMeanings("souris", "fra"))

println(m)

Or Just Download the TSV File

Simple API for JVM Languages

val uwn = new UWN(new File("plugins/"))for (m <- uwn.getMeanings("souris", "fra"))

println(m)

Or Just Download the TSV File

http://lexvo.org/uwn/

UWNUWNUWNUWN

http://www.lexvo.org/uwn/ http://www.lexvo.org/uwn/ http://www.lexvo.org/uwn/ http://www.lexvo.org/uwn/

UWNUWNUWNUWN

http://www.lexvo.org/uwn/ http://www.lexvo.org/uwn/ http://www.lexvo.org/uwn/ http://www.lexvo.org/uwn/

UWNUWNUWNUWN

http://www.lexvo.org/uwn/ http://www.lexvo.org/uwn/ http://www.lexvo.org/uwn/ http://www.lexvo.org/uwn/

UWNUWNUWNUWN

http://www.lexvo.org/uwn/ http://www.lexvo.org/uwn/ http://www.lexvo.org/uwn/ http://www.lexvo.org/uwn/

LREC 2014LREC 2014

Etymological WordnetEtymological Wordnet

Real Understanding?Real Understanding?

Knowledge Bases keep growing, butmuch of the Web is still not truly understood

Knowledge Bases keep growing, butmuch of the Web is still not truly understood

Equivalent:

MetaWeb was acquired by Google.MetaWeb was just recently acquired by Google.MetaWeb, surprisingly, was acquired by Google.

Relation UnderstandingRelation Understanding

MetaWeb was bought out by Google.Google bought MetaWeb.Google acquired MetaWeb.MetaWeb was sold to Google.Google's acquisition of MetaWeb.Google's MetaWeb acquisition.and so on...

Underlying frame: Commercial transfer

Capture the “who-did-what-to-whom”

Microsoft bought the patent from Nokia.Nokia sold the patent to Microsoft.The patent was acquired by Microsoft [from Nokia].The patent was sold [by Nokia] to Microsoft.

Relation Understanding:N-Ary Relations

Relation Understanding:N-Ary Relations

Buyer: Microsoft

Seller: Nokia

Product: The patent

Relation IntegrationRelation Integration

YAGO: isMarriedTo predicateYAGO: isMarriedTo predicate

Freebase: Marriage EntityFreebase: Marriage Entity

Challenge:Modelling

Differences

Challenge:Modelling

Differences

Relation Integration:FrameBase.org

Bringing knowledge into a standard formbased on natural language (FrameNet)

Bringing knowledge into a standard formbased on natural language (FrameNet)

Background:Berkeley FrameNet Project

Background:Berkeley FrameNet Project

Beyond Collections of FactsBeyond Collections of Facts

1. Knowledge Integration and Organization

2. Natural Language Coupling

3. Multimodal and Cognitive Knowledge

MENTA: MediaMENTA: Media

Knowlywood: Human ActivitiesKnowlywood: Human Activities

CIKM 2015CIKM 2015

Knowlywood: Human ActivitiesKnowlywood: Human Activities

CIKM 2015CIKM 2015

Learning Common-SenseLearning Common-SenseLearning Common-SenseLearning Common-Sense

WebChild

AAAI 2014WSDM 2014AAAI 2011

WebChild

AAAI 2014WSDM 2014AAAI 2011

Online: http://bit.ly/webchild

Learning Common-SenseLearning Common-SenseLearning Common-SenseLearning Common-Sense

WebChild

(Tandon et al.WSDM 2014):

WebChild

(Tandon et al.WSDM 2014):

Online: http://bit.ly/webchild

Comparative KnowledgeComparative KnowledgeComparative KnowledgeComparative Knowledge

AAAI 2014AAAI 2014

Joint Disambiguation:<bullet train, quicker than, bus>

<train, slower than, plane><plane, faster than, train><bus, slower than, plane>

Joint Disambiguation:<bullet train, quicker than, bus>

<train, slower than, plane><plane, faster than, train><bus, slower than, plane>

Online: http://bit.ly/webchild

ImpLex:Implicative Commitments

Netflix admitted that they had developed

a prototype.

VSoft pretended that they had developed

a prototype.

Did theydevelop a

prototype?

Gerard de Melo

de Melo & de Paiva (2014)based onLauri Kartunnen's work

ImpLex:Implicative Commitments

It is confusing thatApple has acquired

the technology.

It is unlikely thatApple has acquired

the technology.

Have theyacquired it?

?

de Melo &de Paiva 2014

Word VectorsWord VectorsWord VectorsWord Vectors

x x

x xpetronia

sparrow

parched

aridxdry

x bird

http://www.wikihow.com/Read-a-Book-to-a-Baby-or-Infant#/Image:Read-a-Book-to-a-Baby-or-Infant-Step-5.jpg

Why are Why are Word Vectors Successful?Word Vectors Successful?

Why are Why are Word Vectors Successful?Word Vectors Successful?

Matej Kren: Idiom. Prague Municipal Libraryhttps://www.flickr.com/photos/ill-padrino/6437837857/

● Fast Training on Large Amounts of Text● Joint Optimization (Mutual Influence

between Word Vectors)

● Fast Training on Large Amounts of Text● Joint Optimization (Mutual Influence

between Word Vectors)

Jiaqiang Chen and Gerard de Melo 2015

What Goes into Word Vectors?What Goes into Word Vectors?What Goes into Word Vectors?What Goes into Word Vectors?

The Roman Empire was remarkablymulticultural, with ”a rather astonishingcohesive capacity” to create a senseof shared identity while encompassingdiverse peoples within its political system over a long span of time.

Jiaqiang Chen and Gerard de Melo 2015

What Goes into Word Vectors?What Goes into Word Vectors?What Goes into Word Vectors?What Goes into Word Vectors?

The Roman Empire was remarkablymulticultural, with ”a rather astonishingcohesive capacity” to create a senseof shared identity while encompassingdiverse peoples within its political system over a long span of time.

syntactic

Jiaqiang Chen and Gerard de Melo 2015

What Goes into Word Vectors?What Goes into Word Vectors?What Goes into Word Vectors?What Goes into Word Vectors?

The Roman Empire was remarkablymulticultural, with ”a rather astonishingcohesive capacity” to create a senseof shared identity while encompassingdiverse peoples within its political system over a long span of time.

syntactic semantic!

Jiaqiang Chen and Gerard de Melo 2015

What Goes into Word Vectors?What Goes into Word Vectors?What Goes into Word Vectors?What Goes into Word Vectors?

The Roman Empire was remarkablymulticultural, with ”a rather astonishingcohesive capacity” to create a senseof shared identity while encompassingdiverse peoples within its political system over a long span of time.

semantic!syntactic syntactic?

Jiaqiang Chen and Gerard de Melo 2015

What Goes into Word Vectors?What Goes into Word Vectors?What Goes into Word Vectors?What Goes into Word Vectors?

The Roman Empire was remarkablymulticultural, with ”a rather astonishingcohesive capacity” to create a senseof shared identity while encompassingdiverse peoples within its political system over a long span of time.

semantic!syntactic syntactic? ?

Jiaqiang Chen and Gerard de Melo 2015

What Goes into Word Vectors?What Goes into Word Vectors?What Goes into Word Vectors?What Goes into Word Vectors?

The Roman Empire was remarkablymulticultural, with ”a rather astonishingcohesive capacity” to create a senseof shared identity while encompassingdiverse peoples within its political system over a long span of time.

semantic!syntactic ?

Word2Vec Solution:Subsampling

Word2Vec Solution:Subsampling

syntactic?

Jiaqiang Chen and Gerard de Melo 2015

What Goes into Word Vectors?What Goes into Word Vectors?What Goes into Word Vectors?What Goes into Word Vectors?

The Roman Empire was remarkablymulticultural, with ”a rather astonishingcohesive capacity” to create a senseof shared identity while encompassingdiverse peoples within its political system over a long span of time.

semantic!syntactic ?syntactic?

Jiaqiang Chen and Gerard de Melo 2015

Textual Big DataTextual Big DataTextual Big DataTextual Big Data

Image: https://500px.com/photo/96738645/street-bookstore-by-pablo-tarrero

Better Word Vectorsby Exploiting Knowledge

Better Word Vectorsby Exploiting Knowledge

BetterWord Embeddings

Joint Training

Jiaqiang Chen and Gerard de Melo 2015

Best Paper Awardat NAACL 2015Vector Space

Modeling Workshop

Best Paper Awardat NAACL 2015Vector Space

Modeling Workshop

Textual Big Data Knowledge

Multilingual Word Vectors

Combine word2vec/GloVe Vectorswith multilingual lexical data from Wiktionary

Gerard de Melo

Available soon athttp://gerard.demelo.org/representations/

Contact me!

Available soon athttp://gerard.demelo.org/representations/

Contact me!

Multilingual Word Vectors

Combine word2vec/GloVe Vectorswith multilingual lexical data from Wiktionary

Gerard de Melo

Available soon athttp://gerard.demelo.org/representations/

Contact me!

Available soon athttp://gerard.demelo.org/representations/

Contact me!

Multilingual Analogical ReasoningMultilingual Analogical ReasoningMultilingual Analogical ReasoningMultilingual Analogical Reasoning

Coffee is to Starbucksas ... is to Lipton?

System: Tea

Source vectors: GLOVE

What we should be able to getWhat we should be able to getin the futurein the future

What we should be able to getWhat we should be able to getin the futurein the future

What is thisall about?

This is about the US budget debt ceiling,which needs to be extended. RepublicanSenate majority leader Mitch McConell

would like to...

Image: Microsoft Cortana

Huffington Post

Summary

Knowledge Integration and Organization► MENTA taxonomy► Lexvo.org Linked Data

Natural Language Coupling► Universal WordNet► FrameBase.org

Multimodal and Cognitive Knowledge► Knowlywood► WebChild

More Information:

www.demelo.orggdm@demelo.org

More Information:

www.demelo.orggdm@demelo.org

Gerard de Melo

Recommended