
A Semantic Web Content Model and Repository

Max Völkel

6.9.2007, I-Semantics, Graz

© 2007 Max [email protected]


Outline

Motivation

Analysis: Web vs. Semantic Web

Developing a unified Semantic Web Content Model in three easy steps

Implementation


How to model structure + content in one model?

Background: Wikis, Personal Semantic Wikis, Semantic Desktop, …

Two motivations:
  Bring the flexibility and expressivity of RDF to the end-user
  Allow RDF to model and represent content as well, not only its metadata

Goal: a unified model
  As usable as the web. But how to represent semantics? Semantic queries?
  As expressive and as flexible as the semantic web. How to represent binary data (desktop files, web resources) in RDF?
  Unified search: "Give me all papers written by author X which contain Y"

Analysis: The Web

Granularity gets smaller:
  Web 1.0: homepages, portals
  Web 2.0: micro-content

Renderable representations

Freedom of formalisation: less semantic HTML is less portable, but works

[Diagram: a URI is resolved via HTTP to a Representation (HTML, JPG, CSS, JS, PDF, …). A Representation consists of Content and meta-data: ChangeDate, MimeType, Encoding.]

Analysis: The Semantic Web

Flexible, very expressive

Not expressive enough:
  Literals cannot be addressed
  Statements cannot be addressed (but reified)

10 different node types: complex for end-users

Existing formal knowledge can be re-used

Requirements for a SWCM

Content granularity
Expressivity
Binary Content
Freedom of formalisation
Human-usable:
  Renderable representations
  Human-type-able and memorizable names (e.g. like WikiWords)
  Inverse Relations
Knowledge re-use
Standard CMS features:
  Access rights (addressable parts)
  Versioning (addressable parts)

Comparison

Feature                                Web        Sem. Web   Desired
Content granularity                    mid/large  small      any
Expressivity                           -          +          ++
Binary Content                         ++         ~          ++
Freedom of formalisation               +          -          +
Human-usable                           ++         -          ++
Knowledge re-use                       -          ++         +
Standard CMS features:
  Access rights (addressable parts)    +          ~          +
  Versioning (addressable parts)       +          ~          +

Content granularity goal: from small comments to full web pages/files.
Human-usable covers renderable representations, human-type-able and memorizable names (e.g. like WikiWords), and inverse relations.

Creating the SWCM:

Step 1: A Human-Usable RDF


Step 1: A Human-Usable RDF

Items have a URI and can have a Literal. This makes literals addressable.

[Diagram: an Item has a URI and 0..1 Literal.]

Step 1: A Human-Usable RDF

Statements connect Items. This gives the expressivity of RDF.

[Diagram: a Statement has a source, a relation, and a target; an Item has a URI and 0..1 Literal.]

Step 1: A Human-Usable RDF

Addressable Statements: syntactic sugar over reification.

[Diagram: same elements as on the previous slide: Item (URI, 0..1 Literal) and Statement (source, relation, target).]

Step 1: A Human-Usable RDF

Address Items via a human-type-able name (e.g. WikiWords). This gives human-usable naming.

[Diagram: NameItem is added next to Item (URI, 0..1 Literal) and Statement (source, relation, target).]

Step 1: A Human-Usable RDF

Statements are (Item, NameItem, Item): a decision that relations should be human-name-able.

[Diagram: Item (URI, 0..1 Literal), NameItem, and Statement (source, relation, target).]

Step 1: A Human-Usable RDF

Relations always have an inverse. This makes Item-centric rendering easier for tools.

[Diagram: Relation is added, with an inverse link; the other elements are Item (URI, 0..1 Literal), NameItem, and Statement (source, relation, target).]

Step 1: A Human-Usable RDF

A Model contains Items

A Model has a URI

[Diagram: Model is added; a Model has a URI and contains 0..n Items. The other elements are Item (URI, 0..1 Literal), NameItem, Relation (inverse), and Statement (source, relation, target).]
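The Step 1 model can be sketched as a handful of classes. This is a minimal illustration in Python, not the swecr API: all class and method names are hypothetical, and two modelling choices are assumptions: names are treated as unique within a Model, and a Statement is itself an Item so that it stays addressable.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(eq=False)
class Item:
    """Addressable thing: always has a URI, may carry a literal (0..1)."""
    uri: str
    literal: Optional[str] = None


@dataclass(eq=False)
class NameItem(Item):
    """Item whose literal is a human-type-able name, e.g. a WikiWord."""


@dataclass(eq=False)
class Relation(NameItem):
    """Named relation; every Relation is linked to its inverse."""
    inverse: Optional["Relation"] = None


@dataclass(eq=False)
class Statement(Item):
    """(source, relation, target) that is itself an addressable Item."""
    source: Optional[Item] = None
    relation: Optional[Relation] = None
    target: Optional[Item] = None


class Model:
    """A Model has a URI and contains Items (hypothetical sketch)."""

    def __init__(self, uri: str) -> None:
        self.uri = uri
        self.items: Dict[str, Item] = {}
        self.names: Dict[str, NameItem] = {}

    def _register(self, item: Item) -> Item:
        if isinstance(item, NameItem):
            # Assumption: a name may be used only once per Model.
            if item.literal in self.names:
                raise ValueError(f"name already used: {item.literal}")
            self.names[item.literal] = item
        self.items[item.uri] = item
        return item

    def name_item(self, name: str) -> NameItem:
        """Return the NameItem for `name`, creating it on first use."""
        existing = self.names.get(name)
        if existing is not None:
            return existing
        return self._register(NameItem(f"{self.uri}#{name}", name))

    def relation(self, name: str, inverse_name: str) -> Relation:
        """Create a relation together with its inverse, linked both ways."""
        rel = self._register(Relation(f"{self.uri}#{name}", name))
        inv = self._register(Relation(f"{self.uri}#{inverse_name}", inverse_name))
        rel.inverse, inv.inverse = inv, rel
        return rel

    def state(self, source: Item, relation: Relation, target: Item) -> Statement:
        """Add a Statement; it gets its own URI and is stored like any Item."""
        uri = f"{self.uri}#stmt-{len(self.items)}"
        return self._register(Statement(uri, None, source, relation, target))


model = Model("urn:example:model")
fzi = model.name_item("FZI")
max_ = model.name_item("Max")
employs = model.relation("employs", "worksFor")
stmt = model.state(fzi, employs, max_)
```

Because the inverse is created together with the relation, a tool rendering `max_` can show the statement from its side as "worksFor FZI" without extra configuration.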

Creating the SWCM:

Step 2: Include Binary Content


Step 2: Include Binary Content

From addressable literals to addressable representations

[Diagram: as in Step 1, an Item has a URI and 0..1 Literal.]

Step 2: Include Binary Content

[Diagram: the Literal is replaced by a Representation; an Item has a URI and 0..1 Representation.]

Step 2: Include Binary Content

Representations on the web have some built-in properties:
  Metadata: MIME type, encoding, change date
  Data: the actual content itself

[Diagram: an Item has a URI and 0..1 Representation.]

Step 2: Include Binary Content

[Diagram: a Representation consists of Content, ChangeDate, MimeType, and Encoding; an Item has a URI and 0..1 Representation.]

Step 2: Include Binary Content

In SWCM, representations have an author, like in wikis, blogs, web pages, …
The author can be "anonymous".

[Diagram: Representation with Content, ChangeDate, MimeType, and Encoding; an Item has a URI and 0..1 Representation.]

Step 2: Include Binary Content

[Diagram: the Representation gains an author, next to Content, ChangeDate, MimeType, and Encoding.]
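Step 2 can be sketched in the same spirit. The following Python fragment is an illustration under assumptions, not the swecr implementation: the class layout, the default MIME type, and the `text()` helper are hypothetical; only the property names (content, MIME type, encoding, change date, author) come from the slides.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional

ANONYMOUS = "anonymous"  # assumed marker for the anonymous author


@dataclass
class Representation:
    """Content plus the built-in web metadata and an author."""
    content: bytes
    mime_type: str = "application/octet-stream"
    encoding: Optional[str] = None  # only meaningful for textual content
    author: str = ANONYMOUS
    change_date: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )

    def text(self) -> str:
        """Decode textual content; binary content has no encoding."""
        if self.encoding is None:
            raise ValueError("representation has no text encoding")
        return self.content.decode(self.encoding)


@dataclass
class Item:
    """An Item has a URI and 0..1 Representation."""
    uri: str
    representation: Optional[Representation] = None

    def set_content(self, content: bytes, mime_type: str,
                    encoding: Optional[str] = None,
                    author: str = ANONYMOUS) -> None:
        # Replacing the representation also refreshes the change date.
        self.representation = Representation(content, mime_type, encoding, author)


note = Item("urn:example:note")
note.set_content("Völkel".encode("utf-8"), "text/plain", "utf-8")

photo = Item("urn:example:photo")
photo.set_content(b"\x89PNG", "image/png", author="max")
```

A short literal and a large binary file are handled by the same structure here; they differ only in MIME type and encoding.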

Creating the SWCM:

Step 3: Merge Step 1 and Step 2


The Semantic Web Content Model

[Diagram: the merged model in two parts. Structure: a Model has a URI and contains 0..n Items; a Statement has a source, a relation, and a target; a Relation has an inverse; NameItem as in Step 1. Content: an Item has 0..1 Representation, which consists of Content, ChangeDate, MimeType, Encoding, and an author.]

The Semantic Web Content Model

We expect end-users to understand the circled parts

[Diagram: the same model as on the previous slide, with some parts circled.]

Implementation


Swecr is implemented in two layers

swecr.core interface

swecr.model interface


The swecr.model API (see www.swecr.org)

[Diagram: the swecr.model interfaces IRepository, IModel, IItem, INameItem, IRelation, IStatement, IContentItem, IContent, INameContent, IBinContent, and IMimeType, plus RDF2Go.URI. Edge and attribute labels: source, target, inverse, author, ChangeDate, with multiplicities 0..1 and 0..n.]

Notes in the diagram:
1. The content of an INameItem is unique within its IModel.
2. The MIME type is always "text/plain".

swecr.core


Swecr.core: Some Content stored in RDF

:FZI a swcm:NameItem , swcm:Item ;
    swcm:hasChangeDate "2007-08-24T16:07:29Z"^^xsd:dateTime ;
    swcm:hasContent "FZI Forschungszentrum Informatik" .

:employs a swcm:NameItem , swcm:Item , swcm:Relation ;
    swcm:hasAuthor swcm:anonymous-author ;
    swcm:hasChangeDate "2007-08-24T16:07:32Z"^^xsd:dateTime ;
    swcm:hasContent "employs" ;
    swcm:hasInverse :employedBy .

:worksFor a swcm:NameItem , swcm:Item , swcm:Relation ;
    swcm:hasAuthor swcm:anonymous-author ;
    swcm:hasChangeDate "2007-08-24T16:07:33Z"^^xsd:dateTime ;
    swcm:hasContent "works for" ;
    swcm:hasInverse :employs .

Implemented in two layers

<urn:rnd:-1d72b0a2:11498a0d25f:-7fff>
    a swcm:Item , swcm:Statement ;
    swcm:hasChangeDate "2007-08-24T16:07:30Z"^^xsd:dateTime ;
    swcm:stmtRelation :employs ;
    swcm:stmtSource :FZI ;
    swcm:stmtTarget :Max .

Statements are stored in two RDF models: the user model and the index model. The index model is used for query answering:

:FZI :employs :Max .
:Max :worksFor :FZI .    # redundant
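The two-model idea can be sketched as follows. This is a simplified Python illustration, not the swecr.core code: it assumes that the user model keeps each statement under its own URI, that the index model keeps plain triples including the redundant inverse triple, and that every relation has a registered inverse. The class `TwoModelStore` and its methods are hypothetical.

```python
class TwoModelStore:
    """Keeps addressable statements and a redundant triple index in sync."""

    def __init__(self, inverses):
        # relation -> inverse relation, registered in both directions
        self.inverse = dict(inverses)
        self.inverse.update({inv: rel for rel, inv in inverses.items()})
        self.user_model = {}      # statement URI -> (source, relation, target)
        self.index_model = set()  # plain triples used for query answering

    def add(self, stmt_uri, source, relation, target):
        """Store the statement once, index it in both directions."""
        self.user_model[stmt_uri] = (source, relation, target)
        self.index_model.add((source, relation, target))
        # Redundant inverse triple, so item-centric queries need no reasoning.
        self.index_model.add((target, self.inverse[relation], source))

    def query(self, source=None, relation=None, target=None):
        """Return all indexed triples matching the given pattern."""
        pattern = (source, relation, target)
        return sorted(
            triple for triple in self.index_model
            if all(p is None or p == v for p, v in zip(pattern, triple))
        )


store = TwoModelStore({":employs": ":worksFor"})
store.add("urn:rnd:-1d72b0a2:11498a0d25f:-7fff", ":FZI", ":employs", ":Max")
about_max = store.query(source=":Max")
```

The price of the redundancy is that every write touches both models; the benefit is that a lookup starting from either end of a statement is a plain pattern match.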

But where to store binaries?

[Diagram: layers swecr.model interface, swecr.core interface, and ModelSetImpl (an RDF ModelSet holding the user model and the index model); a "?" marks the missing place for binaries.]

BinStore – a simple binary store

Intuition: the simplest web-like API that could possibly work (and allow random access)

Data model: URI → metadata + InputStream / OutputStream

Simple implementation on files
Future: consider JCR

API:
  getReadHandle: InputStream readStream(), getMimeType(), getSize()
  getWriteHandle: writeStream( InputStream, MimeType ), setMimeType( MimeType )
  getRandomAccessHandle
  delete( URI )

[Diagram: BinStoreImpl implements the Binary Store API.]
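A file-backed store along these lines can be sketched in a few lines of Python. This is an illustration of the idea only, not BinStoreImpl: the class `FileBinStore`, its method names, the hashed file names, and the in-memory MIME type table are all assumptions made for the sketch (a real store would persist its metadata).

```python
import hashlib
import io
import os
import tempfile


class FileBinStore:
    """Maps a URI to one file on disk plus a MIME type."""

    def __init__(self, root):
        self.root = root
        os.makedirs(root, exist_ok=True)
        self.mime_types = {}  # URI -> MIME type (kept in memory only)

    def _path(self, uri):
        # Hash the URI so that any URI yields a safe file name.
        digest = hashlib.sha256(uri.encode("utf-8")).hexdigest()
        return os.path.join(self.root, digest)

    def write_stream(self, uri, stream, mime_type):
        """Copy the input stream to disk in chunks."""
        with open(self._path(uri), "wb") as out:
            for chunk in iter(lambda: stream.read(8192), b""):
                out.write(chunk)
        self.mime_types[uri] = mime_type

    def read_stream(self, uri):
        """Sequential read handle; the caller closes it."""
        return open(self._path(uri), "rb")

    def random_access(self, uri):
        """Seekable read/write handle; the caller closes it."""
        return open(self._path(uri), "r+b")

    def get_mime_type(self, uri):
        return self.mime_types[uri]

    def get_size(self, uri):
        return os.path.getsize(self._path(uri))

    def delete(self, uri):
        os.remove(self._path(uri))
        self.mime_types.pop(uri, None)


bin_store = FileBinStore(tempfile.mkdtemp())
bin_store.write_stream("urn:example:doc", io.BytesIO(b"hello world"), "text/plain")
```

Reading back works through the handles, for example `with bin_store.read_stream("urn:example:doc") as f: data = f.read()`.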

Persistence in an RDF ModelSet and a Binary Store

[Diagram: the same layers (swecr.model interface, swecr.core interface), now with BinStoreImpl (a Binary Store) next to ModelSetImpl (an RDF ModelSet with user model and index model).]

Full text queries need a full text index.
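To make the need for a text index concrete, here is a toy Python sketch of the unified search from the motivation ("all papers written by author X which contain Y"). It is not the swecr query engine: the inverted index, the `writtenBy` relation name, and the sample data are invented for illustration.

```python
import re
from collections import defaultdict


class TextIndex:
    """Tiny inverted index: word -> set of item URIs."""

    def __init__(self):
        self.postings = defaultdict(set)

    def add(self, uri, text):
        for word in re.findall(r"\w+", text.lower()):
            self.postings[word].add(uri)

    def search(self, word):
        return set(self.postings.get(word.lower(), ()))


def papers_by_author_containing(triples, index, author, word):
    """Intersect a structural match with a full-text match."""
    written = {s for (s, p, o) in triples if p == "writtenBy" and o == author}
    return sorted(written & index.search(word))


triples = {
    ("paper1", "writtenBy", "X"),
    ("paper2", "writtenBy", "X"),
    ("paper3", "writtenBy", "Z"),
}
index = TextIndex()
index.add("paper1", "A Semantic Web content model")
index.add("paper2", "Personal wikis")
index.add("paper3", "Semantic desktop")

result = papers_by_author_containing(triples, index, "X", "semantic")
```

The structural half of the query runs against the statements, the textual half against the index, and the answer is the intersection of the two result sets.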

The complete swecr.core

[Diagram: components of swecr.core, shown with the swecr.model and swecr.core interfaces: Query Engine, IndexingBinStore, IndexingModelSet, BinStoreImpl (Binary Store), ModelSetImpl (RDF ModelSet), TextIndexImpl, AdapterServer, and Bin2Text (Aperture). The legend distinguishes existing components from those in progress.]

Download from www.swecr.org

Example: Wiki-page

Example: A wiki-page in SWCM
  Title of wiki page → NameItem
  Content of wiki page → Item
  Relation between title and page content → Statement
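The mapping above can be written out as a small Python sketch. The classes are hypothetical stand-ins for the SWCM concepts, and the relation name `hasPageContent` and the URI scheme are assumptions; the slides do not say which relation connects a title to its page content.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(eq=False)
class Item:
    uri: str
    content: Optional[str] = None


@dataclass(eq=False)
class NameItem(Item):
    """Item whose content is a human-type-able name."""


@dataclass(eq=False)
class Statement(Item):
    source: Optional[Item] = None
    relation: Optional[NameItem] = None
    target: Optional[Item] = None


def wiki_page(base, title, body):
    """Model one wiki page as NameItem + Item + connecting Statement."""
    title_item = NameItem(f"{base}/name/{title}", title)
    body_item = Item(f"{base}/content/{title}", body)
    # Hypothetical relation name, chosen for this example only.
    has_content = NameItem(f"{base}/name/hasPageContent", "hasPageContent")
    link = Statement(f"{base}/stmt/{title}", None, title_item, has_content, body_item)
    return title_item, body_item, link


title_item, body_item, link = wiki_page(
    "urn:example:wiki", "SemanticWeb", "The Semantic Web is ..."
)
```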

Who uses it?
  WavesWiki (part of BMBF project, http://waves.fzi.de)
  SemFS, a Semantic File System (presented at I-KNOW in 2006)
  Conceptual Data Structures (end-user personal KM tool)
  Interest from XWiki and Cognium Systems

Summary

SWCM is a content management model combining the usability of the web with the expressivity and flexibility of the semantic web

[Diagram: Item, Statement (source, relation, target), Relation (inverse), NameItem.]

Future Work:
  Refactor core layer into smaller parts (services)
  Create an API for RDF with binaries
  Unified queries (like the LuceneSAIL or LARQ)
  Crawling of external resources (indexed locally, stored remotely)
  From structured text to SWCM models (see paper)

Thank You.