Exposing Digital Content as Linked Data, and Linking them using StoryBlink
Ben De Meester, Tom De Nies, Laurens De Vocht,
Ruben Verborgh, Erik Mannens, and Rik Van de Walle
University Ghent – iMinds – Multimedia [email protected] | @Ben__DM
NLPDBpedia2015@ISWC | October 11th 2015 | Bethlehem, PA
We live in a fast world
with a lot of content to sift through
http://blog.qmee.com/qmee-online-in-60-seconds/
Book ≠ Fast
Finding a good book in short time?
Recommendations!
Recommendations?
Social recommendations
Long tail
Metadata recommendations
Manual?
What do we want?
Automatic content-based metadata
to fuel future recommendation-engines
Content-based metadata
Get the tags…
DBpedia Spotlight
... use them to represent books’ content …
EPUB CFI, NIF, ITS, …
… and link to other books … in a good way.
TPF, EiCE
Storyblink!
Get the tags
Find out what a book is about…
Semantic tags!
Using NER/NED!
Extract all semantic concepts from the book
AGDISTIS
Open source
Local
NER/NED/NEL
From a book to a semantic book
… …
Split HTML into chunks
HTML to text
Local Spotlight
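The three steps above can be sketched in Python as follows. This is a minimal illustration, not the pipeline used for the slides: the block-level tag list, the function names, and the `SPOTLIGHT_URL` of a locally running DBpedia Spotlight server are all assumptions. The request is only built, not sent.

```python
from html.parser import HTMLParser
from urllib.parse import urlencode
from urllib.request import Request

# Assumption: a DBpedia Spotlight server runs locally on this (hypothetical) address.
SPOTLIGHT_URL = "http://localhost:2222/rest/annotate"

# Assumption: these block-level elements delimit the chunks.
BLOCK_TAGS = {"p", "h1", "h2", "h3", "h4", "h5", "h6", "li", "blockquote"}


class ChunkExtractor(HTMLParser):
    """Splits HTML into plain-text chunks, one per block-level element."""

    def __init__(self):
        super().__init__()
        self.chunks = []
        self._buffer = []

    def flush(self):
        # Join the collected text nodes and normalise whitespace.
        text = " ".join("".join(self._buffer).split())
        if text:
            self.chunks.append(text)
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag in BLOCK_TAGS:
            self.flush()

    def handle_endtag(self, tag):
        if tag in BLOCK_TAGS:
            self.flush()

    def handle_data(self, data):
        self._buffer.append(data)


def html_to_text_chunks(html):
    """Steps 1 and 2: split the HTML into chunks and convert each to text."""
    parser = ChunkExtractor()
    parser.feed(html)
    parser.close()
    parser.flush()  # keep any trailing text outside a block element
    return parser.chunks


def build_spotlight_request(text, confidence=0.5):
    """Step 3: prepare (but do not send) an annotate request for one chunk."""
    body = urlencode({"text": text, "confidence": confidence}).encode("utf-8")
    return Request(SPOTLIGHT_URL, data=body, headers={"Accept": "application/json"})
```

Each chunk would then be sent with `urllib.request.urlopen(build_spotlight_request(chunk))`, and the returned resources mapped back to their position in the book.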
Represent a book by tags
@prefix schema: <http://schema.org/> .
@prefix nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .
@prefix dbr: <http://dbpedia.org/resource/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix pg84: <http://www.gutenberg.org/ebooks/84.epub#> .

pg84:book a schema:Book .

pg84:epubcfi(/6/12!/4/2/4) itsrdf:taIdentRef dbr:Chamois ;
    nif:sourceUrl pg84:book .
pg84:epubcfi(/6/2!/4/46[chap01]/16/42) itsrdf:taIdentRef dbr:Chamois ;
    nif:sourceUrl pg84:book .
pg84:epubcfi(/6/12!/4/2/6) itsrdf:taIdentRef dbr:Desert ;
    nif:sourceUrl pg84:book .
...
Link to other books
Open Source
Linked data path finding
Multiple paths
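To make "multiple paths" concrete, here is a small Python sketch that enumerates all simple paths between two books in a toy in-memory graph. It is a plain breadth-first search under stated assumptions, not the path-finding algorithm of the engine named earlier; the triples, the `book:A` / `book:B` identifiers, and the `max_hops` limit are hypothetical.

```python
from collections import defaultdict


def build_graph(triples):
    """Build an undirected adjacency map from (subject, predicate, object) triples."""
    graph = defaultdict(set)
    for subject, _predicate, obj in triples:
        graph[subject].add(obj)
        graph[obj].add(subject)
    return graph


def find_paths(graph, start, goal, max_hops=4):
    """Return all simple paths from start to goal with at most max_hops edges,
    shortest paths first."""
    paths = []
    frontier = [[start]]
    for _ in range(max_hops):
        next_frontier = []
        for path in frontier:
            for neighbour in sorted(graph.get(path[-1], ())):
                if neighbour in path:
                    continue  # keep paths simple: never revisit a node
                extended = path + [neighbour]
                if neighbour == goal:
                    paths.append(extended)
                else:
                    next_frontier.append(extended)
        frontier = next_frontier
    return paths


# Hypothetical toy data: two books and the concepts they mention.
TOY_TRIPLES = [
    ("book:A", "mentions", "dbr:Chamois"),
    ("book:B", "mentions", "dbr:Chamois"),
    ("book:A", "mentions", "dbr:Desert"),
    ("dbr:Desert", "related", "dbr:Sahara"),
    ("book:B", "mentions", "dbr:Sahara"),
]

toy_paths = find_paths(build_graph(TOY_TRIPLES), "book:A", "book:B", max_hops=3)
```

In this toy graph the two books are connected both directly through a shared concept and indirectly through two related concepts, so more than one path is returned.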
Keeping all concepts…
Not all mentioned concepts are useful.
The path finding becomes really slow.
What happens if we keep the top X%?
[Figure: #paths and time (s) versus the amount of considered concepts (%)]
Top 50% of found concepts gives similar paths, but a lot faster
[Same figure, with a "Time-out" annotation]
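Keeping only the top X% of concepts can be sketched as below. The slides do not state how concepts are ranked; this sketch assumes ranking by number of mentions in the book, with ties broken alphabetically, and the function name is hypothetical.

```python
from collections import Counter
from math import ceil


def top_concepts(mentions, fraction=0.5):
    """Return the most frequently mentioned concepts, keeping the given fraction.

    Assumption: importance is approximated by mention count.
    """
    counts = Counter(mentions)
    # Sort by descending count, then by name so the result is deterministic.
    ranked = sorted(counts, key=lambda concept: (-counts[concept], concept))
    keep = ceil(len(ranked) * fraction)
    return ranked[:keep]
```

For example, with `fraction=0.5` a book that mentions four distinct concepts keeps the two mentioned most often.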
Optimized Results
@prefix schema: <http://schema.org/> .
@prefix nif: <http://persistence.uni-leipzig.org/nlp2rdf/ontologies/nif-core#> .
@prefix itsrdf: <http://www.w3.org/2005/11/its/rdf#> .
@prefix dbr: <http://dbpedia.org/resource/> .
@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix pg84: <http://www.gutenberg.org/ebooks/84.epub#> .

pg84:book a schema:Book .
pg84:book itsrdf:taIdentRef dbr:Chamois, dbr:Desert, ...
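The step from fragment-level annotations to this book-level statement can be illustrated with a short Python sketch. It assumes the annotations are available as (fragment, concept) pairs and simply deduplicates the concepts in order of first appearance; the function name is hypothetical and no filtering is applied here.

```python
def book_level_statement(book, annotations):
    """Collapse (fragment, concept) pairs into one book-level Turtle statement."""
    # dict.fromkeys removes duplicates while preserving first-seen order.
    concepts = list(dict.fromkeys(concept for _fragment, concept in annotations))
    if not concepts:
        return ""
    return f"{book} itsrdf:taIdentRef {', '.join(concepts)} ."


# The three fragment-level annotations from the earlier slide.
fragment_annotations = [
    ("pg84:epubcfi(/6/12!/4/2/4)", "dbr:Chamois"),
    ("pg84:epubcfi(/6/2!/4/46[chap01]/16/42)", "dbr:Chamois"),
    ("pg84:epubcfi(/6/12!/4/2/6)", "dbr:Desert"),
]
statement = book_level_statement("pg84:book", fragment_annotations)
```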
http://uvdt.test.iminds.be/storyblinkdata/books
Storyblink
Exploring the links between classic works
Choose two books, and…
Storyblink
Next steps
Scale
Indirect paths
e.g. book about WWI and book about WWII
Relevancy measures
Knowledge base influence
Filtering influence
Storyblink gives a semantic representation of important semantic concepts
inside books, and uses those to connect books together content-wise
http://uvdt.test.iminds.be/storyblink
Demo 48
Our project
The Publisher of the Future
Our pilot project partners: