8/9/2019 ICSC 2014 Knowledge Graphs Sindice
1/33
ICSC 2014
16/6/2014
Enterprise "Knowledge Graphs": when "Web of Data" technologies make a lot of sense in business scenarios.
Dr. Giovanni Tummarello
FBK- Italy
Sindicetech - Ireland

Background
Sindice.com (2007 to end of 2011)
Pushing search engine tech to the Web of Data

How we started: Sindice.com
• Goal: building and exp… centralized Semantic W…
• Started with 1 machine (200…)
• Stopped crawling in 2012: 700M pages, 50B triples

Sindice.com pipeline
[Figure: pipeline diagram. Stages: Crawler, Sitemap logic, Data Staging, Extraction, Enhancement, HBase, Analytics]

Technical milestones (Sindice.com)
Web of Data acquisition
• Semantic Sitemaps: …erage of traditional sitemaps
• Started and open-sourced Any23.org (now an Apache top-level project!)
Enrichment / Web-scale reasoning
• Full reasoning materialized before indexing. Reasoning is done by fetching properties and ontologies online; reusing partial reasoning results made it efficient. Hadoop driven.

Sindice.com for data "webmasters" (1)
Powered by Hadoop analytics, Sindice shows data as a whole per website. Emp… interconnection between entities both within (on different pages) and across websites/d…
The "summary graph" is 600+M triples itself.
• Individual page markup inspector
• Full website markup
http://demo.sindice.net/dataset/

Sindice.com for data "webmasters" (2)

Sindice.com: some applications
• Sindice Site Services: put simple markup, get data-powered widgets and site search!
• Sig.ma: Sindice-based mashups on the fly.

End of 2011
• Schema.org booming
• "Web of Data launched" as mainstream
• Team quite interested in having impact and spinning off
• Sindice.com: not a clear business model
• Sindice Technologies, on the other hand, receiving direct attention from enterprises: "You guys handle B of …, can you do the same behind a corporate "firewall"?"
Sindice Technologies had their own life!

Sindice(Tech) 2011-today
Enterprise Linked Data Clouds

What is a "Knowledge Graph"?
• See excellent Google introduction:
• See also http://semanticweb.com/at-semtechbiz-knowledge-graphs-are-everywhere_b37724

The Cham… in Knowledge Integration
[Figure labels: Big Data (Hadoop and NoSQL); Smart x; Logica… …are…]
Challenges
• Volume
• Velocity: the speed of data
• Variety: schema diversity, semistructured data
• Variability: different contexts and business needs

Enterprise Knowledge Graphs
1) Graph based, no schema
2) Add first, pay as you go: under… and use later
• Variety: schema diversity, semistructured data
• Variability: different contexts and business needs

An Enterprise Knowledge Graph
[Figure labels: Knowledge Graph; Table; Data Graph; References, Key Concepts, Relations; External Domain Unstructured/Semistructured content; Custom; Enrichment and Encoding via Domain Ontology; Relational DB; Align. Applications: Search++, Recomme…, Vertical ap…, Explorativ…]

A knowledge refinery model
[Figure: Knowledge Workflow. Sources: Wikipedia, Freebase + others, Schema.org, OpenGraph, text content around OG/Schema, structured data, text in posts, open Web content. Algorithms: entity recognition/semantic lifting, validation, linking/data fusion, enhancement/reasoning (by rules), metadata extraction, label propagation, clustering/classification, text analytics. Other labels: Sources API, SPAR… explora…, Data Model + Trust/Provenance, JSON, Structure + Trust, Exploration, Validation, Analytics, Test/Validation, Assisted Querying, Giraph]

Sindice contributions to an advanced Knowledge Graph pipeline:
• Transform/Extraction
• Preliminary cleanup
• Data Model
• Alignment
• Enrichment (Ontology/Rule)
• Hadoop Transformation
• Hadoop RDF Summaries
• Assisted Querying
• Relational data browser
• Explore
Processing modes: full processing, precompute, delta path

Graph Knowledge Data M…
Classic Graph Theory vs. Knowledge Graphs
• Node/Edge. Classic graph theory: 1 type, 1 edge. Knowledge graphs: thousands of types and node properties.
• Typical queries/scenarios. Classic graph theory: mapping distance between entities, asking shortest path, graph traversal, etc. Knowledge graphs: mapping complex relati…, allowing for arbitrary kn…; ad hoc/BI kind of queries.
• Management needs. Classic graph theory: generally simple. Knowledge graphs: distributed sources, ontologies, …
• Typical tools. Classic graph theory: graph stores (Neo4J, Giraph). Knowledge graphs: RDF, triplestores, Spa…
• Analytics/processes. Classic graph theory: metrics, centrality, classification, label propagation, etc. Knowledge graphs: data graph summari…

Example Knowledge Graph "Summaries"
A small Knowledge Graph (figure A) and 3 alternative summary representations (b, c, d).
Snapshot from UI showing from BBC.CO.UK in the Sin…
Sindice.com graph size, ab…
http://107.178.212.248/

Graph Summary: Background
• A type of analytics to extract structural properties
• Note: structure is not just schema! We apply it on "schema" in our examples. It can be applied on "values", thus a "cluster" can be a…
• Bisimulation / DataGuides: structural summaries for graph/semistructured data
• Optimizing query processing
• Data exploration / analytics / bottom-up ontology ex…
• Formulating meaningful queries
• Computationally expensive (!) for large graphs
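A minimal sketch of what a type-level graph summary could look like, assuming the graph is given as (subject, predicate, object) triples. Nodes sharing the same set of types collapse into one summary node, and each summary edge records how many data edges it stands for. This is only one possible summary representation, not the Sindice implementation; a bisimulation-based summary would group nodes by structure instead. The function name, the "(untyped)" class and the sample data (including the class name schema:Movie) are assumptions made for illustration.

```python
from collections import Counter, defaultdict

RDF_TYPE = "rdf:type"

def summarize(triples):
    """Collapse nodes with the same set of types into one summary node."""
    types = defaultdict(set)
    for s, p, o in triples:
        if p == RDF_TYPE:
            types[s].add(o)

    def node_class(node):
        # Untyped nodes (including literals) share one catch-all class.
        return frozenset(types.get(node, ())) or frozenset({"(untyped)"})

    # Count how many data edges each (class, predicate, class) edge covers.
    edges = Counter()
    for s, p, o in triples:
        if p != RDF_TYPE:
            edges[(node_class(s), p, node_class(o))] += 1
    return edges

# Hypothetical sample data, not taken from the Sindice dataset.
triples = [
    ("ex:m1", RDF_TYPE, "schema:Movie"),
    ("ex:m2", RDF_TYPE, "schema:Movie"),
    ("ex:r1", RDF_TYPE, "schema:AggregateRating"),
    ("ex:r2", RDF_TYPE, "schema:AggregateRating"),
    ("ex:m1", "schema:aggregateRating", "ex:r1"),
    ("ex:m2", "schema:aggregateRating", "ex:r2"),
    ("ex:m1", "schema:name", "A title"),
]

summary = summarize(triples)
for (src, pred, dst), count in summary.items():
    print(sorted(src), pred, sorted(dst), count)
```

Seven data triples shrink to two summary edges here. The summary is much smaller than the data yet still says which predicates connect which kinds of entities.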

Assisted SPARQL Query Editor
[Screenshot: editor suggestions including schema:AggregateRating and schema:M…]
http://107.178.212.248/freebase/SparqlEditor/
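A rough sketch of how a graph summary could drive query assistance, assuming the summary is available as a mapping from (source class, predicate, target class) to an edge count. Given the class already chosen in a query, the editor can propose only predicates that actually occur for it, most frequent first. The function name, the summary shape and the sample counts are hypothetical; how the actual editor ranks suggestions is not stated in the slides.

```python
def suggest_predicates(summary, class_name, prefix=""):
    """Rank predicates leaving `class_name`, most frequent first."""
    candidates = [
        (count, pred, dst)
        for (src, pred, dst), count in summary.items()
        if src == class_name and pred.startswith(prefix)
    ]
    # Highest count first; ties broken alphabetically by predicate.
    candidates.sort(key=lambda c: (-c[0], c[1]))
    return [(pred, dst) for _, pred, dst in candidates]

# Hypothetical summary: (source class, predicate, target class) -> edge count.
summary = {
    ("schema:Movie", "schema:aggregateRating", "schema:AggregateRating"): 120,
    ("schema:Movie", "schema:actor", "schema:Person"): 340,
    ("schema:Movie", "schema:name", "(literal)"): 500,
    ("schema:Person", "schema:name", "(literal)"): 900,
}

# The user has typed "schema:a" after a variable known to be a schema:Movie.
suggestions = suggest_predicates(summary, "schema:Movie", prefix="schema:a")
print(suggestions)
```

Because the lookup runs against the summary rather than the full data, it stays cheap even when the underlying graph is very large.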

Semistructured Search
• Bridging structured and unstructured data
• Application example: reducing queries that fall back to web search by supporting queries that span structured & unstructured/textual …
SindiceTech SIREn: semistructured search

• SIREn specializes in semi-structured search: nested / mixed content (JSON, XML)
• Fastest & most scalable search method available; only known high-performance implementation
• Available for Solr and ElasticSearch

Under the hood: SIREn
• Inspired by tree-labelling scheme techniques: label each node with a hierarchical id (here … identifier)
• Full-text search operators over the content
• Structural search operators over the node tree: Ancestor-Descendant, Parent-Child, Sibling, …
• Positional search operators over the content of a node
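The structural operators listed above can be sketched on top of hierarchical ids, assuming dotted, Dewey-style ids such as 1.2.1 in which a node's id extends its parent's id by one component. This illustrates the general tree-labelling idea only and is not SIREn's actual code or API; all function names are made up for the example.

```python
def parse(node_id):
    """Turn a dotted id such as '1.2.1' into a tuple of ints: (1, 2, 1)."""
    return tuple(int(part) for part in node_id.split("."))

def is_ancestor(a, b):
    """True if a is a proper ancestor of b (a's id is a strict prefix of b's)."""
    a, b = parse(a), parse(b)
    return len(a) < len(b) and b[:len(a)] == a

def is_parent(a, b):
    """True if a is the direct parent of b (b extends a by one component)."""
    a, b = parse(a), parse(b)
    return len(b) == len(a) + 1 and b[:-1] == a

def are_siblings(a, b):
    """True if a and b are distinct nodes sharing the same parent."""
    a, b = parse(a), parse(b)
    return a != b and len(a) == len(b) and a[:-1] == b[:-1]

print(is_ancestor("1", "1.2.2.1"))
print(is_parent("1.2", "1.2.2"))
print(are_siblings("1.2.2.1", "1.2.2.2"))
```

Comparing tuples of integers rather than raw strings avoids false matches such as treating 1.1 as a prefix of 1.10. Each check touches only the two ids, so no tree traversal is needed at query time.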

Under the hood: Tree Labelling
{
  "name" : "LucidWorks",
  "category_code" : "analytics",
  "funding_rounds" : [
    {
      "round_code" : "a",
      "raised_amount" : 6000000,
      "funded_year" : 2009,
      …
    },
    …
  ]
}

Under the hood: Tree Labelling
[Figure: the same JSON document drawn as a tree, each node labelled with a hierarchical id. Node labels: name, LucidWorks, funding_rounds, round_code, raised_amount, …. Ids shown: 1, 1.1, 1.1.1, 1.2, 1.2.1, 1.2.2, 1.2.2.1, 1.2.2.2]
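A minimal sketch of how hierarchical ids could be assigned to the nodes of a JSON document like the one on this slide. The scheme is an assumption chosen for illustration: the root object is 1, each key gets the next child position under its object, and a key's value hangs under the key. It reproduces name = 1.1 and LucidWorks = 1.1.1, but the remaining ids need not match the figure, which leaves out some fields. The elided parts of the slide's JSON ("…") are dropped here so that the document parses.

```python
import json

def label(value, node_id=(1,), out=None):
    """Assign a hierarchical id to every node of a parsed JSON value."""
    if out is None:
        out = []
    if isinstance(value, dict):
        out.append((node_id, "{}"))
        for i, (key, child) in enumerate(value.items(), start=1):
            key_id = node_id + (i,)
            out.append((key_id, key))
            # The value of a key is labelled as the key's first child.
            label(child, key_id + (1,), out)
    elif isinstance(value, list):
        out.append((node_id, "[]"))
        for i, child in enumerate(value, start=1):
            label(child, node_id + (i,), out)
    else:
        out.append((node_id, value))
    return out

doc = json.loads("""
{
  "name": "LucidWorks",
  "category_code": "analytics",
  "funding_rounds": [
    {"round_code": "a", "raised_amount": 6000000, "funded_year": 2009}
  ]
}
""")

nodes = label(doc)
labels = {".".join(map(str, node_id)): content for node_id, content in nodes}
for dotted_id, content in labels.items():
    print(dotted_id, content)
```

Once every node carries such an id, an index can store the id next to each term. A structural query then becomes a comparison of ids rather than a walk over the original document.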

Performance
• Siren BlockJoin (ElasticSearch/Solr)
• S…O… M… files

SIREn ranking: winner of the Yahoo SemSearch Challenge
• BTC dataset created from web crawl
• Billions of triples, hundreds of millions of entities
• Real-world entity search queries from Yahoo
• Result achieved via query-time BM25MF (!)
[Chart: scores on SS10 and SS11, BM25 vs. BM25MF]

Interactive Relational Data Navigation

Summary
• Presented reusable ideas and tech derived from Sindice.com:
• Ideas on a big data/knowledge graph pipeline
• Knowledge Graph analytics (summary graph) applications
• Siren: advanced search engine for mixed structured/unstructured data
• PivotBrowser: big data semistructured search, realtime UI

Thank you
• Useful links
Sindice story blog post: http://semanticweb.com/end-support-sindice-com-search-engine-history-lessons-learned-legacy-guest-post_b42797
Siren: http://sirendb.com
Any23: http://any23.apache.org
• Publications
R. Delbru, S. Campinas, G. Tummarello. Searching Web Data: an Entity Retrieval and High-Performance In… … of Web Semantics, 2011.
G. Tummarello, R. Cyganiak, M. Catasta, S. Danielczyk, R. Delbru, S. Decker. Sig.ma: Live views on the W… … of Web Semantics, 2010.
R. Delbru, G. Tummarello, A. Polleres. Context-Dependent OWL Reasoning in Sindice - Experiences and Le… International Conference on Web Reasoning and Rule Systems (RR), 2011.
S. Campinas, T. E. Perry, D. Ceccarelli, R. Delbru and G. Tummarello. Introducing Graph Summary Wi… Assisted SPARQL Formulation. (…). Vienna, 2012.
S. Campinas, R. Delbru, G. Tummarello. Effective Retrieval Model for Entity with Multi-Valued Attributes: BM25MF… EKAW 2012.
R. Cyganiak, H. Stenzhorn, R. Delbru, S. Decker and G. Tummarello. Semantic Sitemaps: Efficient and Flex… …atasets on the Semantic Web. ESWC, 200…