CACAO: Conditional Spread Activation for Keyword Query Interpretation · 15th International Conference on Semantic Web, Karlsruhe 2019 · Edgard Marx, Gustavo Publio & Thomas Riechert, LiberAI

DBtrends Exploring Query Logs for Ranking RDF Data · 2019-09-19 · Edgard Marx, Gustavo Publio & Thomas Riechert LiberAI. 2 ... Framework (RDF) Unstructured Linked Data Structured


Page 1

CACAO: Conditional Spread Activation for Keyword Query Interpretation

15th International Conference on Semantic Web, Karlsruhe 2019

Edgard Marx, Gustavo Publio & Thomas Riechert

LiberAI

Page 2

openQA

*https://github.com/AKSW/openQA

Previous Works

Page 3

KBox

[Flowchart: the User/Application provides a SPARQL query (SELECT ?s ?p ?o ...) and the target Knowledge Graph URI (http://knowledgegraph.uri). Decision node: "Is the Knowledge Graph in KBox?" with YES and NO branches. Other nodes: Resolve, Dereference, DDNS, Execute SPARQL Query.]

*https://github.com/AKSW/KBox

Previous Works

Page 4

NSpM

*https://github.com/AKSW/NSpM

Previous Works

Page 5

DBNQA

*https://github.com/AKSW/DBNQA

https://wikipedia.org/wiki/List_of_datasets_for_machine-learning_research

Previous Works

Page 6

Outline

• Motivation

• Problem Statement

• Background

• Approach

• Evaluation

• Conclusion & Future Works

Page 7

[Diagram labels: Resource Description Framework (RDF), Unstructured, Structured, Crowd, Linked Data]

Motivation

Page 8

• Entity Retrieval
• Question Answering
• Entity Linking

Linked Data

Resource Description Framework (RDF)

Access (the information needed)

Motivation

Page 9

Resource Description Framework (RDF)

Entity Retrieval

Formally, a top-k entity retrieval takes a keyword query $Q$, an integer $k > 0$, a set of entities $E = \{e_1, e_2, \ldots, e_{|E|}\}$, and returns the top-k entities based on a scoring function $S(Q, e)$.

Problem statement
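The definition above can be sketched in a few lines of Python. This is only an illustration: the function names, the toy entity list, and in particular `overlap_score` (a stand-in for $S(Q, e)$ that counts shared tokens) are assumptions, not the scoring function used in CACAO.

```python
import heapq
from typing import Callable, Iterable, List, Tuple


def top_k_entities(query: str, entities: Iterable[str], k: int,
                   score: Callable[[str, str], float]) -> List[Tuple[float, str]]:
    """Return the k highest-scoring (score, entity) pairs, best first."""
    if k <= 0:
        raise ValueError("k must be a positive integer")
    scored = ((score(query, e), e) for e in entities)
    return heapq.nlargest(k, scored)


def overlap_score(query: str, entity: str) -> float:
    """Hypothetical S(Q, e): number of query tokens found in the entity name."""
    q_tokens = set(query.lower().split())
    e_tokens = set(entity.lower().replace("_", " ").split())
    return float(len(q_tokens & e_tokens))


# Toy entity set E (illustrative names only).
ENTITIES = ["Albert_Lea", "Albert_Lea_Public_Library", "Leipzig"]
RESULT = top_k_entities("Albert Lea", ENTITIES, 2, overlap_score)
```

With this naive score, both "Albert Lea" candidates receive the same value, so the ranking between them is decided only by tie-breaking. A more discriminating $S(Q, e)$ is needed to separate them.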

Page 10

Resource Description Framework (RDF)

Entity Retrieval

SELECT * WHERE {
  ?s dbo:birthPlace dbpedia:Leipzig
}

Problem statement

Page 11

TF-IDF

$$w_{t,d} = tf_{t,d} \cdot \log \frac{|D|}{|\{d' \in D \mid t \in d'\}|}$$

Background
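As a rough illustration of the TF-IDF weight on this slide, the sketch below computes $w_{t,d}$ with raw term counts and the natural logarithm. The tiny corpus and the function name `tf_idf` are made up for the example. Returning 0 for a term that occurs in no document is an added convention to avoid division by zero.

```python
import math
from collections import Counter
from typing import List


def tf_idf(term: str, doc: List[str], corpus: List[List[str]]) -> float:
    """w_{t,d} = tf_{t,d} * log(|D| / |{d' in D : t in d'}|)."""
    tf = Counter(doc)[term]                        # raw count of the term in d
    df = sum(1 for d in corpus if term in d)       # documents containing the term
    if df == 0:
        return 0.0                                 # convention for unseen terms
    return tf * math.log(len(corpus) / df)


# Illustrative three-document corpus D (tokenized).
CORPUS = [
    "albert lea is a city".split(),
    "albert lea public library".split(),
    "leipzig is a city".split(),
]
W_LIBRARY = tf_idf("library", CORPUS[1], CORPUS)   # term in 1 of 3 documents
W_ALBERT = tf_idf("albert", CORPUS[1], CORPUS)     # term in 2 of 3 documents
```

The rarer term ("library") gets the larger weight, which is the intended effect of the inverse document frequency factor.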

Page 12

BM25

“This can opener can open any can that a can opener can and if this can opener cannot open any can that a can opener can, then this can opener is free”

“If this can opener does not open any can, then you can take it for free”

Background
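The slide gives no formula, so the following is a sketch of the commonly used Okapi BM25 term score, under stated assumptions: the IDF variant `log(1 + (N - n + 0.5) / (n + 0.5))` and the defaults `k1 = 1.2`, `b = 0.75` are conventional choices, not values taken from the talk. The two sentences from the slide are used as a two-document toy corpus to show how the repeated term "can" is dampened by term-frequency saturation and length normalization.

```python
import math


def bm25_term_score(tf: int, doc_len: int, avg_doc_len: float,
                    n_docs: int, doc_freq: int,
                    k1: float = 1.2, b: float = 0.75) -> float:
    """Okapi BM25 contribution of one query term to one document."""
    idf = math.log(1 + (n_docs - doc_freq + 0.5) / (doc_freq + 0.5))
    norm = k1 * (1 - b + b * doc_len / avg_doc_len)   # length normalization
    return idf * tf * (k1 + 1) / (tf + norm)          # saturates as tf grows


LONG = ("this can opener can open any can that a can opener can and if this "
        "can opener cannot open any can that a can opener can then this can "
        "opener is free").split()
SHORT = ("if this can opener does not open any can then you can take it "
         "for free").split()

AVG_LEN = (len(LONG) + len(SHORT)) / 2
SCORE_LONG = bm25_term_score(LONG.count("can"), len(LONG), AVG_LEN, 2, 2)
SCORE_SHORT = bm25_term_score(SHORT.count("can"), len(SHORT), AVG_LEN, 2, 2)
```

Although "can" occurs several times more often in the long sentence, its BM25 score grows far less than proportionally, and it can never exceed `idf * (k1 + 1)`.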

Page 13

BM25

Background

Page 14

BM25F

Background

Page 15

Field Weighted

weight(p1) > weight(p2) > weight(p3) …

Background
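One simple way to read the ordering weight(p1) > weight(p2) > weight(p3) is as a weighted term frequency, where a match in a more important field counts more. The sketch below assumes exactly that. The field names, weight values, and sample entities are illustrative, and per-field length normalization (as used in full BM25F) is deliberately left out.

```python
from typing import Dict, List


def weighted_tf(term: str, fields: Dict[str, List[str]],
                weights: Dict[str, float]) -> float:
    """Sum per-field term counts, each scaled by its field weight."""
    return sum(weights.get(name, 0.0) * tokens.count(term)
               for name, tokens in fields.items())


# Hypothetical weights satisfying weight(p1) > weight(p2) > weight(p3).
WEIGHTS = {"p1": 3.0, "p2": 2.0, "p3": 1.0}

# Two toy entities: A matches the term in a high-weight field, B in a low one.
ENTITY_A = {"p1": ["albert", "lea"], "p3": ["some", "other", "text"]}
ENTITY_B = {"p3": ["albert", "lea", "public", "library"]}

TF_A = weighted_tf("albert", ENTITY_A, WEIGHTS)
TF_B = weighted_tf("albert", ENTITY_B, WEIGHTS)
```

Both entities contain the term once, but the match in the higher-weighted field makes entity A score higher.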

Page 16

Learning to Rank

Background

Page 17

Entity Link in Queries

[Diagram: an Entity Linking block (Query, Entity Candidate Generation, Ranking, Query Expansion; output: Entities) shown together with the labels Query, Linked Entities, and Entity Retrieval.]

Background

Page 18

General IR Architecture

Entity Retrieval, Entity Linking, Question Answering

[Diagram: Query, Entity Candidate Generation, Ranking, Query Expansion; output: Result.]

Background

Page 19

Interaction

[Diagram: three pipelines, one each for Entity Retrieval, Entity Linking, and Question Answering. Each shows Query, Entity Candidate Generation, Ranking, and Query Expansion. Outputs shown: Entities, Linked Entities, Answer.]

Background

Page 20

Latest Architecture

[Diagram: two pipelines, one each for Entity Retrieval and Entity Linking. Each shows Query, Entity Candidate Generation, Ranking, and Query Expansion. Outputs shown: Entities, Linked Entities.]

Background

Page 21

Latest Architecture

Background

Page 22

Current Scenario

[Diagram: three pipelines, one each for Entity Retrieval, Entity Linking, and Question Answering. Each shows Query, Entity Candidate Generation, Ranking, and Query Expansion. Outputs shown: Entities, Linked Entities, Answer.]

Background

Page 23

Return linked resources

[Diagram: three pipelines, one each for Entity Retrieval, Entity Linking, and Question Answering. Each shows Query, Entity Candidate Generation, Ranking, and Query Expansion. Outputs shown: Linked Resources, Linked Entities, Answer. A callout numbered 1 is present.]

Proposal

Page 24

Return the answer

[Diagram: three pipelines, one each for Entity Retrieval, Entity Linking, and Question Answering. Each shows Query, Entity Candidate Generation, Ranking, and Query Expansion. Outputs shown: Linked Resources + Answer, Linked Entities, Answer. Callouts numbered 1 and 2 are present.]

Proposal

Page 25

Challenge 1: *Local Term Frequency

*not to be confused with global term frequency

Q: “Albert Lea”

[Figure: two candidate entities, dbr:Albert_Lea and dbr:Albert_Lea_Public_Library …, each linked to several literals “Albert Lea …”. The counts 18 and 1 are shown for one group of literals and 7 and 1 for the other.]

Page 26

Challenge 2: Cohesion

Q: “Albert Lea”

[Figure: two candidate entities, dbr:Albert_Lea and dbr:Albert_Lea_Public_Library …, labeled A and B, each linked to several literals “Albert Lea …”. The counts 18 and 1 are shown for one group of literals and 7 and 1 for the other.]

Page 27

Challenge 2: Cohesion

Q: “Albert Lea”

[Figure: two candidate entities, dbr:Albert_Lea and dbr:Albert_Lea_Public_Library …, labeled A and B, each linked to several literals “Albert Lea …”. The counts 18 and 1 are shown for one group of literals and 7 and 1 for the other.]

Proposition 1 [1][2]: maximize the number of tokens and reduce the number of mapped entities

[1] SWJ, Shekarpour et al. 2015, [2] NAACL, Sakor et al. 2019

Page 28

Challenge 2: Cohesion

Q: “Albert Lea”

[Figure: two candidate entities, dbr:Albert_Lea and dbr:Albert_Lea_Public_Library, labeled A and B, with literals “Albert Lea …” and the counts 18 and 7.]

Proposition 1 [1][2]: maximize the number of tokens and reduce the number of mapped entities

Does NOT satisfy: both A and B comply with Proposition 1

[1] SWJ, Shekarpour et al. 2015, [2] NAACL, Sakor et al. 2019

Page 29

Challenge 2: Cohesion

Q: “Albert Lea”

[Figure: two candidate entities, dbr:Albert_Lea and dbr:Albert_Lea_Public_Library …, labeled A and B, each linked to several literals “Albert Lea …”. The counts 18 and 1 are shown for one group of literals and 7 and 1 for the other.]

Proposition 2 [3][4]: use the context

[3] ISWC, Usbeck et al. 2014, [4] KCap, Moussallem et al. 2017

Page 30

Challenge 2: Cohesion

Q: “Albert Lea”

[Figure: two candidate entities, dbr:Albert_Lea and dbr:Albert_Lea_Public_Library, labeled A and B, with literals “Albert Lea …” and the counts 18 and 7.]

Proposition 2 [3][4]: use the context

Does NOT satisfy: both A and B comply with Proposition 2

[3] ISWC, Usbeck et al. 2014, [4] KCap, Moussallem et al. 2017

Page 31

Challenge 2: Cohesion

Q: “strawberry ice cream”

Proposition 3 [5][6]: Max Similarity (Levenshtein and Jaccard)

A: dbr:Strawberry_ice_cream, with literal “Strawberry_ice_cream”

B: dbr:Banana_split, with literals “strawberry” and “ice cream”

[5] AAAI, Zhang et al. 2016, [6] VLDB, Zheng et al. 2019

Page 32

Challenge 2: Cohesion

Q: “strawberry ice cream”

Proposition 3 [5][6]: Max Similarity (Levenshtein and Jaccard)

A: dbr:Strawberry_ice_cream, with literal “Strawberry_ice_cream”

B: dbr:Banana_split, with literals “strawberry” and “ice cream”

Similarity values shown: 1, 1/3, 2/3

Does NOT satisfy: both A and B comply with Proposition 3

[5] AAAI, Zhang et al. 2016, [6] VLDB, Zheng et al. 2019
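The values 1, 1/3 and 2/3 on this slide are consistent with a token-level Jaccard similarity between the query and each literal. The sketch below reproduces them under that assumption. The tokenization (lowercasing, treating underscores as spaces) and the helper names are illustrative choices, not details taken from the talk.

```python
from fractions import Fraction


def tokens(text: str) -> set:
    """Lowercase, treat underscores as spaces, split on whitespace."""
    return set(text.lower().replace("_", " ").split())


def jaccard(a: str, b: str) -> Fraction:
    """|A intersect B| / |A union B| over token sets, as an exact fraction."""
    ta, tb = tokens(a), tokens(b)
    union = ta | tb
    if not union:
        return Fraction(0)
    return Fraction(len(ta & tb), len(union))


QUERY = "strawberry ice cream"
# Candidate A: one literal covering the whole query.
SIM_A = jaccard(QUERY, "Strawberry_ice_cream")
# Candidate B: two literals, each covering part of the query.
SIM_B = [jaccard(QUERY, "strawberry"), jaccard(QUERY, "ice cream")]
```

Here `SIM_A` is 1 and `SIM_B` is [1/3, 2/3]. If the per-literal similarities are summed, B also reaches 1, which is one way to read why similarity alone may not separate the two candidates.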

Page 33

Challenge 3: Scoring Function

[Chart: BM25F Score; y axis from 0 to 10, x axis from 1 to 20.]

Where is the answer?

Page 34

Approach

Can we do better?

Page 35

Conditional Spread Activation

Q: “Albert Lea”

Challenge 1: Local Term Frequency

[Figure: two candidate entities, dbr:Albert_Lea and dbr:Albert_Lea_Public_Library …, labeled A and B, each linked to several literals “Albert Lea …”, with the counts 18 and 1 for one group and 7 and 1 for the other. The annotations s = 0 and s > 0 each appear twice.]

Approach

Page 36

Field Weight

Q: “Albert Lea”

Challenge 1: Local Term Frequency
Challenge 2: Cohesion

[Figure: two candidate entities, dbr:Albert_Lea and dbr:Albert_Lea_Public_Library, labeled A and B, each with one literal “Albert Lea …” (count 1) and the annotation s > 0.]

weight(p_type) > weight(p_label) > weight(p_others)

Approach

Page 37

Field Weight

Q: “Albert Lea”

Challenge 1: Local Term Frequency
Challenge 2: Cohesion

[Figure: two candidate entities, dbr:Albert_Lea and dbr:Albert_Lea_Public_Library, labeled A and B, each with one literal “Albert Lea …” (count 1).]

$$S(Q_i, L_i) = \sum Q_i L_i \quad \text{if } \sum Q_i L_i = 1$$

Approach

Page 38

Field Weight

Q: “Albert Lea”

Challenge 1: Local Term Frequency
Challenge 2: Cohesion

[Figure: two candidate entities, dbr:Albert_Lea and dbr:Albert_Lea_Public_Library, labeled A and B, each with one literal “Albert Lea …” (count 1).]

$$1 > j(Q_i, L_i)$$

$$S(Q_i, L_i) = \sum Q_i L_i$$

$$2^2 = 4$$

$$S(Q_i, L_i) = \sum Q_i L_i \quad \text{if } \sum Q_i L_i = 1$$

Approach

Page 39

Field Weight

Challenge 1: Local Term Frequency
Challenge 2: Cohesion
Challenge 3: Scoring Function

[Chart: y axis from 0 to 2500, x axis from 1 to 6, with one point marked *P.]

The answer!!

Approach

Page 40

Pipeline

[Diagram: Query, Entity Candidate Generation, Ranking, Query Expansion; output: Result.]

Evaluation

Page 41

Benchmark

Evaluation

Page 42

DBpedia-Entity

Evaluation

Page 43

QALD-4

Evaluation

Page 44

Conclusion & Future Works

• We have shown a promising algorithm that outperforms state-of-the-art ER engines on standard benchmarks by addressing:
    o Local Term Frequency;
    o Cohesion; and
    o Scoring Function problems

• It is still too soon to draw conclusions (law of small numbers)

• Improve the algorithm runtime so it can be used at scale

Page 45

Acknowledgements

@theLiberAI @HTWKLeipzig @edgardmarx @akswgroup

Page 46

Acknowledgements

@theLiberAI @HTWKLeipzig @edgardmarx @akswgroup

Thank you!!!

* 100+ GitHub stars

* 40+ forks