ICSC 2014 Knowledge Graphs Sindice


  • 8/9/2019 ICSC 2014 Knowledge Graphs Sindice

    1/33

    ICSC 2014

    16/6/2014

    Enterprise "Knowledge Graphs": when "Web of Data" technologies make a lot of sense in business scenarios.

    Dr. Giovanni Tummarello

    FBK - Italy

    Sindicetech - Ireland


    Background

    Sindice.com (2007 - end 2011)
    Pushing search engine tech to the Web of Data


    How we started: Sindice.com

    • Goal: building and exp… Centralized Semantic Web …
    • Started with 1 machine (200…)
    • Stopped crawling 2012: 700M pages, 50B triples


    Sindice.com pipeline sketch

    • Crawler
    • Sitemap logic
    • "Pings"
    • Data Staging
    • Extraction
    • Enhancement
    • HBase
    • Analytics


    Technical milestones (Sindice.com)

    Web of Data acquisition
    • Semantic Sitemaps, leverage of traditional sitemaps
    • Started and open-sourced Any23.org (now an Apache top-level project!)

    Enrichment / Web-scale reasoning
    • Full reasoning materialized before indexing; reasoning done by fetching properties and ontologies online; partial reasoning results reused, which made it efficient; Hadoop driven


    Sindice.com for "data webmasters" (1)

    Powered by Hadoop analytics, Sindice shows data as a whole per website. Emphasis on interconnection between entities, both within a website (on different pages) and across websites/datasets. The "summary graph" is itself 600+M triples.

    Individual page markup inspector; full website markup

    http://demo.sindice.net/dataset/


    Sindice.com for "data webmasters" (2)


    Sindice.com: some applications

    • Sindice Site Services (POC): put simple markup, get data-powered widgets and site search!
    • Sig.ma: Sindice-based mashups on the fly.


    End of 2011

    • Schema.org booming
    • Web of Data "launched" as mainstream
    • Team quite interested in having impact and spinning off
    • Sindice.com: not a clear business model
    • Sindice Technologies, on the other hand, receiving direct attention from enterprises: "You guys handle B of …, can you do the same behind a corporate firewall?"

    Sindice Technologies had their own life!


    Sindice(Tech) 2011 - today

    Enterprise Linked Data Clouds


    What is a "Knowledge Graph"?

    • See the excellent Google introduction
    • See also http://semanticweb.com/at-semtechbiz-knowledge-graphs-are-everywhere_b37724


    The Chasm in Knowledge Integration

    [Figure: Big Data (Hadoop and NoSQL) set against the challenges of Volume, Velocity (the speed of data), Variety (schema diversity, semistructured data) and Variability (different contexts and business needs).]


    Enterprise Knowledge Graph

    1) Graph base, no schema
    2) Add first, pay as you go: understand and use later

    Variety: schema diversity, semistructured data
    Variability: different contexts and business needs
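The "graph base, no schema" and "add first, understand later" points can be illustrated with a toy example. This is only a sketch under assumptions: the `TripleStore` class, the record identifiers (`crm:42`, `erp:7`) and the field names are hypothetical and are not taken from the talk or from any Sindice tool.

```python
from collections import defaultdict


class TripleStore:
    """Toy schema-less graph: any (subject, predicate, object) can be added."""

    def __init__(self):
        self.triples = set()

    def add_record(self, subject, record):
        # "Add first": no schema check, every key simply becomes a predicate.
        for predicate, value in record.items():
            self.triples.add((subject, predicate, value))

    def predicates_in_use(self):
        # "Understand later": discover which predicates exist and how often.
        counts = defaultdict(int)
        for _, predicate, _ in self.triples:
            counts[predicate] += 1
        return dict(counts)


store = TripleStore()
# Two sources describing similar things with different, unaligned fields.
store.add_record("crm:42", {"name": "ACME", "country": "IE"})
store.add_record("erp:7", {"label": "ACME Ltd", "vat": "IE123"})
print(store.predicates_in_use())
```

Both records are accepted although they share no field names; deciding that `name` and `label` should be aligned is deferred to a later step, which is the pay-as-you-go idea.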


    An Enterprise Knowledge Graph

    [Figure labels: Relational DB / Tables; Data Graph; External domain unstructured/semistructured content; Custom; Align; Enrichment and encoding via domain ontology; Knowledge Graph (references, key concepts, relations).]

    • Search++
    • Recommendations
    • Vertical apps
    • Explorative …


    A knowledge refinery model

    [Figure: Knowledge Workflow. Labels recoverable from the diagram:
    Sources: Wikipedia; Freebase + others; Schema.org, OpenGraph; text content around OG/Schema; text in posts; open Web content; structured data; JSON/XML; API.
    Algorithms: entity recognition / semantic lifting; validation; linking / data fusion; enhancement / reasoning (by rules); metadata extraction, label propagation, clustering / classification; text analytics; Giraph.
    Data model: structure + trust/provenance.
    Exploration / validation: SPARQL exploration; analytics, test/validation, assisted querying.]


    Sindice contributions to an advanced Knowledge Graph pipeline:

    Transform / Extraction
    • Preliminary cleanup
    • Data model
    • Alignment
    • Enrichment (ontology/rules)
    • Hadoop transformation

    Explore
    • Hadoop RDF summaries
    • Assisted querying
    • Relational data browser

    • SE… Se… Se…
    • Ho…
    • Tri… co…
    • Re… Br…

    Full processing / Precompute / Delta path


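    Graph Knowledge Data M…

    Classic Graph Theory vs. Knowledge Graphs

    • Node/Edge
      - Classic Graph Theory: 1 type, 1 edge
      - Knowledge Graphs: thousands of types and node properties
    • Typical queries/scenarios
      - Classic Graph Theory: mapping distance between entities, asking shortest path, graph traversal etc.
      - Knowledge Graphs: mapping complex relati…, allowing for arbitrary kn…; ad hoc / BI kind of queries; … path
    • Management needs
      - Classic Graph Theory: generally simple
      - Knowledge Graphs: distributed sources, ontologies, …
    • Typical tools
      - Classic Graph Theory: graph stores (Neo4J, Giraph)
      - Knowledge Graphs: RDF / triplestores / SPARQL
    • Analytics/Processes
      - Classic Graph Theory: metrics / centrality / classification / label propagation etc.
      - Knowledge Graphs: Data Graph Summari…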


    Example Knowledge Graph "Summaries"

    A small Knowledge Graph (figure A) and 3 alternative summary representations (b, c, d).
    Snapshot from the UI showing … from BBC.CO.UK in the Sin…
    Sindice.com graph size, ab…

    http://107.178.212.248/


    Graph Summary - Background

    • A type of analytics to extract structural properties
    • Note: Structure is not just schema!
      - We apply it on "schema" in our example
      - Can be applied on "values", thus a "cluster" can be a…
    • Bisimulation / DataGuides: structural summaries for graph/semistructured data
      - Optimizing query processing
      - Data exploration / analytics / bottom-up ontology ex…
      - Formulating meaningful queries
    • Computationally expensive (!) for large graphs
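To make the idea of a structural summary concrete, here is a minimal sketch that collapses an entity graph into a class-level graph by grouping nodes on their declared type. It is a deliberate simplification and not the bisimulation or DataGuide constructions named on the slide; the function name `summarize`, the `ex:` identifiers and the bucket names `Untyped` and `Literal` are assumptions made for illustration.

```python
from collections import Counter


def summarize(triples, type_predicate="rdf:type"):
    """Collapse an entity graph into a class-level summary graph."""
    # Map each entity to its class. If an entity has several types, the last
    # one seen wins, which is a simplification kept for brevity.
    types = {s: o for s, p, o in triples if p == type_predicate}
    summary = Counter()
    for s, p, o in triples:
        if p == type_predicate:
            continue
        # Objects without a declared type (e.g. literals) share one bucket.
        edge = (types.get(s, "Untyped"), p, types.get(o, "Literal"))
        summary[edge] += 1
    return summary


triples = [
    ("ex:a1", "rdf:type", "schema:Article"),
    ("ex:a2", "rdf:type", "schema:Article"),
    ("ex:p1", "rdf:type", "schema:Person"),
    ("ex:a1", "schema:author", "ex:p1"),
    ("ex:a2", "schema:author", "ex:p1"),
    ("ex:a1", "schema:headline", "Hello"),
]
summary = summarize(triples)
for (src, predicate, dst), count in summary.items():
    print(src, predicate, dst, count)
```

The six input triples reduce to two summary edges with counts, which is the kind of compact structure that can drive data exploration or query suggestion. A single pass like this is cheap; the more expressive summaries mentioned above are the ones that become expensive on large graphs.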


    Assisted SPARQL Query Editor

    [Screenshot labels: sch…, schema:AggregateRating, schema:M…]

    http://107.178.212.248/freebase/SparqlEditor/


    Semistructured Search

    • Bridging structured and unstructured data
    • Application example:
      - Reducing queries that fall back to web search
      - ...by supporting queries that span structured & unstructured/textual

    SindiceTech Siren: semistructured …


    • SIREn specializes in semi-structured search
      - nested / mixed content: JSON, XML
    • Fastest & most scalable search method available …; only known high-performance implementation …
    • Available for Solr and ElasticSearch


    Under the hood: SIREn

    • Inspired from tree-labelling scheme techniques (…)
      - Label each node with a hierarchical id (here … identifier)
    • Full-text search operators over the content
    • Structural search operators over the node tree
      - Ancestor-Descendant, Parent-Child, Sibling, …
    • Positional search operators over the content of a node


    Under the hood: Tree Labelling

    [Figure: the document below drawn as a tree, with nodes name, LucidWorks, funding_rounds, round_code, raised_amount, …]

    {
      "name" : "LucidWorks",
      "category_code" : "analytics",
      "funding_rounds" : [
        {
          "round_code" : "a",
          "raised_amount" : 6000000,
          "funded_year" : 2009,
          …
        },
        …
      ]
    }


    Under the hood: Tree Labelling

    {
      "name" : "LucidWorks",
      "category_code" : "analytics",
      "funding_rounds" : [
        {
          "round_code" : "a",
          "raised_amount" : 6000000,
          "funded_year" : 2009,
          …
        },
        …
      ]
    }

    [Figure: the same document drawn as a tree (nodes name, LucidWorks, funding_rounds, round_code, raised_amount, …), each node labelled with a hierarchical id: 1, 1.1, 1.1.1, 1.2, 1.2.1, 1.2.2, 1.2.2.1, 1.2.2.2.]
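The labelling idea can be sketched in a few lines of Python. This is an illustration under assumptions, not SIREn's actual implementation: the numbering rules, the helper names (`label_tree`, `is_ancestor`, `is_parent`, `are_siblings`, `find`) and the `<object>` / `<array>` placeholder labels are hypothetical. The ids also differ from those in the slide's figure, because this sketch labels every attribute, including `category_code`, and drops the elided parts of the document.

```python
def _label(value, node_id, out):
    """Recursively record node_id -> label for one JSON value."""
    if isinstance(value, dict):
        out[node_id] = "<object>"
        for i, (key, child) in enumerate(value.items(), start=1):
            attr_id = node_id + (i,)
            out[attr_id] = key  # attribute name is a child of the object
            if isinstance(child, list):
                # Array items hang directly under the attribute node.
                for j, item in enumerate(child, start=1):
                    _label(item, attr_id + (j,), out)
            else:
                _label(child, attr_id + (1,), out)
    elif isinstance(value, list):
        out[node_id] = "<array>"
        for j, item in enumerate(value, start=1):
            _label(item, node_id + (j,), out)
    else:
        out[node_id] = value  # scalar leaf


def label_tree(doc):
    """Return a dict mapping hierarchical ids (tuples of ints) to labels."""
    out = {}
    _label(doc, (1,), out)
    return out


def is_ancestor(a, b):
    """True if a is a proper ancestor of b, i.e. a is a strict prefix of b."""
    return len(a) < len(b) and b[:len(a)] == a


def is_parent(a, b):
    return is_ancestor(a, b) and len(b) == len(a) + 1


def are_siblings(a, b):
    return a != b and a[:-1] == b[:-1]


def find(labels, text):
    return [node_id for node_id, label in labels.items() if label == text]


doc = {
    "name": "LucidWorks",
    "category_code": "analytics",
    "funding_rounds": [
        {"round_code": "a", "raised_amount": 6000000, "funded_year": 2009}
    ],
}
labels = label_tree(doc)
(rounds,) = find(labels, "funding_rounds")
(code,) = find(labels, "round_code")
print(".".join(map(str, code)), is_ancestor(rounds, code))
```

With ids of this shape, the structural operators reduce to prefix comparisons on the ids: no tree traversal is needed to decide whether `round_code` sits below `funding_rounds`.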


    Performance

    • Siren Blockjoin (ElasticSearch/Solr)
    • … files


    SIREn ranking: winner, Yahoo SemSearch Challenge

    • BTC dataset created from web crawl
    • Billions of triples, hundreds of millions of entities
    • Real-world entity search queries from Yahoo
    • Results achieved via query-time "BM25MF" (!)

    [Chart: results for SS10 and SS11; series include BM25 and BM25MF.]


    Interactive Relational Data Navigation


    Summary

    • Presented reusable ideas and tech derived from Sindice.com
      - Ideas on a big data/knowledge graph pipeline
      - Knowledge Graph analytics (summary graph) applications
      - Siren: advanced search engine for mixed structured/unstructured data
      - PivotBrowser: big data semistructured search, realtime UI


    Thank you

    • Useful links
      - Sindice story blog post: http://semanticweb.com/end-support-sindice-com-search-engine-history-lessons-learned-legacy-guest-post_b42797
      - Siren: http://sirendb.com
      - Any23: http://any23.apache.org

    • Publications
      - R. Delbru, S. Campinas, G. Tummarello. Searching Web Data: an Entity Retrieval and High-Performance Indexing Model. Journal of Web Semantics, 2011.
      - G. Tummarello, R. Cyganiak, M. Catasta, S. Danielczyk, R. Delbru, S. Decker. Sig.ma: Live views on the Web of Data. Journal of Web Semantics, 2010.
      - R. Delbru, G. Tummarello, A. Polleres. Context-Dependent OWL Reasoning in Sindice - Experiences and Lessons Learnt. International Conference on Web Reasoning and Rule Systems (RR), 2011.
      - S. Campinas, T. E. Perry, D. Ceccarelli, R. Delbru and G. Tummarello. Introducing RDF Graph Summary with Application to Assisted SPARQL Formulation. (DEXA). Vienna, 2012.
      - S. Campinas, R. Delbru, G. Tummarello. Effective Retrieval Model for Entity with Multi-Valued Attributes: BM25MF and Beyond. EKAW, 2012.
      - R. Cyganiak, H. Stenzhorn, R. Delbru, S. Decker and G. Tummarello. Semantic Sitemaps: Efficient and Flexible Access to Datasets on the Semantic Web. ESWC, 2008.