Data Portability with SIOC and FOAF

"Data Portability with SIOC and FOAF" talk at XTech 2008 (May 9, 2008, Dublin, Ireland). See: http://2008.xtech.org/public/schedule/detail/565

Copyright 2008 Digital Enterprise Research Institute. All rights reserved.

www.deri.ie

Data Portability with SIOC and FOAF

Uldis Bojārs (1), Alexandre Passant (2), John Breslin (1)

(1) Digital Enterprise Research Institute, National University of Ireland, Galway
(2) LaLIC, Université Paris-Sorbonne / Electricité de France R&D

XTech 2008, Dublin, Ireland - 2008-05-09

What’s the problem and how to solve it?

What if I use multiple services and I want to…

• Merge my social networks between various websites
• Invite my friends from a social media website to a new service I’ve just registered
• Move the stuff I have on one service to another (e.g. move all my blog posts, comments, etc. from WordPress.com to “Acme Blogs”)
• And more:
  – See my stuff on a third-party service providing an aggregate view, like FriendFeed, but in an open way
  – Archive and preserve my social media contributions
  – …

So many social media sites…

* Source: Smashcut Media, www.smashcut-media.com

Even more services…

It takes a lot of time…

Filling out your profiles, re-adding your friends…

Uploading posts and content items to “stovepipes”!

Social media sites are like data silos

* Source: Pidgin Technologies, www.pidgintech.com

Many isolated communities of users and their data

* Source: Pidgin Technologies, www.pidgintech.com

Need ways to connect these islands

* Source: Pidgin Technologies, www.pidgintech.com

Allowing users to easily move from one to another

* Source: Pidgin Technologies, www.pidgintech.com

Enabling users to easily bring their data with them

* Source: Pidgin Technologies, www.pidgintech.com

Social networking fatigue

• How many general or niche SNSs are you willing to register with and / or interact with?

• “People are getting sick of registering and re-declaring their friends on every site” Brad Fitzpatrick (Aug. 2007)

• Need for a “social graph” with distributed social networks and reusable profiles

• A Bill of Rights for Users of the Social Web (Sept. 07)
  – Ownership
  – Control
  – Freedom
• The Semantic Web can help!

Data portability is important!

Data Portability

• Social network portability
  – User profiles
  – Social network relations
• User-created content portability
  – Social media contributions
• … and all other kinds of data portability needs that you can think of …

Social Network Portability with FOAF (Friend of a Friend)

The Semantic Web in brief

• “The Semantic Web is an extension of the current web in which information is given well-defined meaning, better enabling computers and people to work in cooperation” - Tim Berners-Lee, James Hendler, Ora Lassila, Scientific American, May 2001

• A common model to describe data in a machine-readable way:
  – RDF (Resource Description Framework)
  – RDF statements are triples (subject predicate object):
      XTech2008 isA Conference .
      XTech2008 takesPlaceIn Dublin .
      Uldis isPresentingAt XTech2008 .
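
As a rough illustration of the triple model, the three statements above can be held as (subject, predicate, object) tuples and matched against a pattern. This is a minimal Python sketch, not an RDF library: the `match` helper is hypothetical, and bare strings stand in for the URIs that real RDF would use.

```python
# Each RDF statement is a (subject, predicate, object) triple.
# Plain strings stand in for URIs here; that is an assumption for illustration.
triples = {
    ("XTech2008", "isA", "Conference"),
    ("XTech2008", "takesPlaceIn", "Dublin"),
    ("Uldis", "isPresentingAt", "XTech2008"),
}


def match(graph, s=None, p=None, o=None):
    """Return the triples matching a pattern; None acts as a wildcard."""
    return sorted(
        t for t in graph
        if (s is None or t[0] == s)
        and (p is None or t[1] == p)
        and (o is None or t[2] == o)
    )


# Everything stated with XTech2008 as the subject.
about_xtech = match(triples, s="XTech2008")

# Who is presenting at XTech2008?
presenters = [s for s, _, _ in match(triples, p="isPresentingAt", o="XTech2008")]
```

A pattern with wildcards is also the basic building block of a SPARQL query, which is why the same data can be queried uniformly once it is expressed as triples.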

RDF

• Graph-based model for describing:
  – information about objects
  – relationships between them
• Different mediums for expressing this knowledge:
  – RDF in XML, JSON, N3 (plain text)
  – Embedded: RDFa (in HTML)
  – Extractable using GRDDL (from Microformats, …)

RDF (II)

• Large set of tools & libraries
  – All major programming languages
• Standard query language
  – SPARQL (W3C Recommendation)
• Interesting applications built on top of it
  – SPARQL interfaces to relational DBMSs
  – Reasoning / inference engines
  – Web of Linked Data

Vocabularies / Ontologies

• Common semantic model for data, using ontologies:
  – “An ontology is a specification of a conceptualisation” - Tom Gruber
  – RDFS (RDF Schema)
  – OWL (Web Ontology Language)
• Lightweight Vocabularies are Good for You :)
  – FOAF, SIOC, DOAP
• The Semantic Web FAQ:
  – http://www.w3.org/2001/sw/SW-FAQ

Lightweight Ontologies

• FOAF
  – Friend Of A Friend - http://foaf-project.org
  – Used by Apple Safari, Google Social Graph API, …
• DOAP
  – Description of a Project - http://trac.usefulinc.com/doap
  – Used by Apache projects catalogue, …
  – Consumed by http://doapstore.org/, …
• SIOC
  – Semantically-Interlinked Online Communities
  – http://sioc-project.org
  – Generated by WordPress and other plugins, see: http://rdfs.org/sioc/applications/

The Semantic Web and Web 2.0

• Semantic Web and Web 2.0 can benefit from each other to lead to a better Web, with social and machine-understandable data.

• Many examples:
  – Vocabularies: FOAF to describe people, SIOC to describe their data
  – Semantic Wikis: Semantic MediaWiki, OntoWiki …
  – Revyu.com: A review website based on SW technologies
  – Tagging: The Tag Ontology, SCOT, MOAT

• “I think we could have both Semantic Web technology supporting online communities, but at the same time also online communities can support Semantic Web data by being the sources of people voluntarily connecting things together.” Tim Berners-Lee

Discovering the “invisible” web of data

• “Semantic Radar” extension for Firefox:
  – http://sioc-project.org/firefox
  – Easy to set up and use (Firefox extension, auto-update)
  – Support for RDFa!
  – Architecture of participation: just browse the Web
  – Discover Semantic Web documents using RDF autodiscovery links (a popular practice for advertising Atom/RSS and FOAF):

      <head>
        <link rel="meta" type="application/rdf+xml" title="FOAF"
              href="http://example.com/people/~you/foaf.rdf"/>
      </head>

• RDF Browsers:
  – Tabulator, Disco, Zitgist, …
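
To make the autodiscovery mechanism concrete, here is a hedged Python sketch of how a tool might find such links in a page. It uses only the standard library; the class name `RDFLinkFinder` and the sample page are assumptions for illustration and do not describe how Semantic Radar itself is implemented.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class RDFLinkFinder(HTMLParser):
    """Collect <link rel="meta" type="application/rdf+xml"> elements."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.found = []  # list of (title, absolute URL) pairs

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        a = dict(attrs)
        rel_values = (a.get("rel") or "").lower().split()
        is_rdf = a.get("type") == "application/rdf+xml"
        if "meta" in rel_values and is_rdf and a.get("href"):
            # Resolve relative hrefs (e.g. "foaf.rdf") against the page URL.
            self.found.append((a.get("title"), urljoin(self.base_url, a["href"])))


# Hypothetical page, modelled on the snippet in the slide.
page = """<html><head>
<link rel="meta" type="application/rdf+xml" title="FOAF"
      href="http://example.com/people/~you/foaf.rdf"/>
<link rel="stylesheet" type="text/css" href="style.css"/>
</head><body></body></html>"""

finder = RDFLinkFinder("http://example.com/people/~you/")
finder.feed(page)
links = finder.found
```

Only the first link is reported; the stylesheet link is ignored because its `rel` and `type` do not match.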

Representing people and their relationships

• FOAF is the main vocabulary used to represent people:
  – Friend Of A Friend - http://foaf-project.org
  – foaf:Person class:
    • “The foaf:Person class represents people. Something is a foaf:Person if it is a person.”
• Give yourself a URI:
  – http://apassant.net/alex
• Relationships = foaf:knows property
  – :John foaf:knows :Alex .

Enriching FOAF data

• Extensions using the RELATIONSHIP vocabulary:
  – http://vocab.org/relationship/
  – All rel:* properties are subproperties of foaf:knows
  – :John rel:worksWith :Uldis .
  – RDFS inferencing allows tools to answer queries using foaf:knows when people use rel:* alternatives
• Can mix together with information in many different vocabularies (as necessary):
  – DOAP (Description of a Project) - for software projects
  – SIOC - for user-created content
  – …
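
The subproperty inference mentioned above can be sketched in a few lines of Python. This is an illustrative approximation of the RDFS rule for rdfs:subPropertyOf, not a full reasoner; the function name, the prefixed-string identifiers, and the sample data are assumptions.

```python
KNOWS = "foaf:knows"

# Schema fragment: rel:* properties declared as subproperties of foaf:knows.
sub_property_of = {
    "rel:worksWith": KNOWS,
    "rel:friendOf": KNOWS,
}

# Sample data mixing foaf:knows with the more specific rel:* properties.
data = {
    (":John", "rel:worksWith", ":Uldis"),
    (":John", "foaf:knows", ":Alex"),
    (":Alex", "rel:friendOf", ":Uldis"),
}


def infer_super_properties(triples, schema):
    """Add (s, super, o) for every (s, p, o) where p is a subproperty of super."""
    inferred = set(triples)
    changed = True
    while changed:  # repeat so that chains of subproperties are followed
        changed = False
        for s, p, o in list(inferred):
            sup = schema.get(p)
            if sup and (s, sup, o) not in inferred:
                inferred.add((s, sup, o))
                changed = True
    return inferred


closure = infer_super_properties(data, sub_property_of)

# A query phrased only in terms of foaf:knows now also finds rel:worksWith links.
john_knows = sorted(o for s, p, o in closure if s == ":John" and p == KNOWS)
```

Without the inference step, the same query would return only `:Alex`, because the link to `:Uldis` was stated with `rel:worksWith`.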

Integrating social networks with FOAF

Common formats, unique URIs

Source: Sheila Kinsella, Applications of Social Network Analysis 2007

A common semantics for existing services

• Existing FOAF exporters for Facebook, Flickr, Twitter…
• Run unified queries using SPARQL

Social Network data from multiple sources

• Flickr: http://apassant.net/home/2007/12/flickrdf/data/people/33669349@N00

• Twitter: http://tools.opiumfield.com/twitter/terraces/rdf

• Facebook: http://ext.dcs.shef.ac.uk/~u0057/FoafGenerator/files/foaf-607513040.rdf

Some more FOAF

• LiveJournal
• MyBlogLog
  – http://kentbrewster.com/foafster/
• CPAN Net::Flickr::Backup
  – http://search.cpan.org/dist/Net-Flickr-Backup-1.2/lib/Net/Flickr/Backup.pm
• Pownce
• …

Identity management across networks

• A need to unify URIs from different services so as to represent one's unified identity. Linked Data principles are to use owl:sameAs and rdfs:seeAlso:
  – See: http://www4.wiwiss.fu-berlin.de/bizer/pub/LinkedDataTutorial/
  – owl:sameAs: used to identify two resources with different URIs as being the same resource
    • :alex owl:sameAs flickr:33669349@N00 .
  – rdfs:seeAlso: “More information about this resource can be found here”; can be used by Semantic Web tools such as Tabulator
    • http://apassant.net/blog/2008/01/12/one-foaf-fits-all/
• Inference using owl:InverseFunctionalProperty:
  – foaf:mbox, foaf:openid, etc. can be used to identify uniqueness for a foaf:Person
• Unifying aspects of a foaf:Person across networks:
  – All relevant relationships are related to one foaf:Person
  – Social network unification
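
The identity unification described above can be approximated with a small union-find over URIs. The sketch below is a simplification under stated assumptions: the e-mail address, the `twitter:` and `flickr:` friend identifiers, and the helper names are invented for illustration, and only owl:sameAs plus two inverse functional properties are handled.

```python
# Properties treated as inverse functional: a shared value implies the same person.
IFPS = {"foaf:mbox", "foaf:openid"}

# Hypothetical data gathered from several FOAF exports.
triples = [
    (":alex", "foaf:mbox", "mailto:alex@example.org"),
    (":alex", "owl:sameAs", "flickr:33669349@N00"),
    ("twitter:alex", "foaf:mbox", "mailto:alex@example.org"),
    ("twitter:alex", "foaf:knows", "twitter:uldis"),
    ("flickr:33669349@N00", "foaf:knows", "flickr:john"),
]

parent = {}


def find(x):
    """Return the representative URI of the identity that x belongs to."""
    parent.setdefault(x, x)
    while parent[x] != x:
        parent[x] = parent[parent[x]]  # path halving keeps lookups short
        x = parent[x]
    return x


def union(a, b):
    """Merge the identities of a and b, keeping the smaller string as representative."""
    ra, rb = find(a), find(b)
    if ra != rb:
        parent[max(ra, rb)] = min(ra, rb)


first_subject_for = {}  # (property, value) -> first subject seen with that value
for s, p, o in triples:
    if p == "owl:sameAs":
        union(s, o)
    elif p in IFPS:
        key = (p, o)
        if key in first_subject_for:
            union(s, first_subject_for[key])
        else:
            first_subject_for[key] = s

# All foaf:knows links, rewritten so that every alias points to one identity.
merged_knows = sorted((find(s), find(o)) for s, p, o in triples if p == "foaf:knows")
```

After merging, the Twitter and Flickr relationships both hang off the single `:alex` identity, which is the "social network unification" the slide refers to.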

Distributed social networking with FOAF

Applications for browsing the social (semantic) graph

• FOAFnaut, FOAF Explorer, etc.
• FOAFGear: thanks to common semantics, only 100 lines of code: http://apassant.net/home/2008/01/foafgear/

Semantic social networks tools and services

• Browse / re-use your social graph in personal applications
• Merge identities with pre-defined rules
• Tools:
  – Beatnik
  – Knowee
  – SPARQLpress
  – Nepomuk

Combining FOAF and OpenID

• Link to your FOAF profile from your OpenID URL, so that services can get your machine-readable profile when you log in:

    <head>
      <link rel="meta" type="application/rdf+xml" title="FOAF" href="foaf.rdf" />
    </head>

• FOAF + OpenID Scenario
  – Bob creates an account on Networkr, a new social networking website, using OpenID
  – Networkr retrieves the FOAF URI thanks to an auto-discovery link
  – From the FOAF file, it identifies whether any people already subscribed to Networkr are listed in Bob’s defined relationships
  – If that is the case, Bob can add them as “local connections”, share data with them, etc. without having to once again search for / add his friends
  – Specific rules:
    • If I know X from Flickr, he / she can see my pictures on Networkr
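
The matching step in the scenario above might look roughly like the following Python sketch. Networkr is the fictional service from the slide; the member table, the contact records, and the function names are all hypothetical, and the step of fetching and parsing Bob's FOAF file is assumed to have already happened.

```python
# Hypothetical: existing Networkr members, keyed by their OpenID URL.
networkr_members = {
    "http://alice.example.org/": "alice",
    "http://carol.example.org/": "carol",
}

# Hypothetical result of parsing Bob's FOAF file: the people he foaf:knows,
# with the OpenID found in the profile and the service the link came from.
bobs_contacts = [
    {"name": "Alice", "openid": "http://alice.example.org/", "source": "flickr"},
    {"name": "Dave", "openid": "http://dave.example.org/", "source": "twitter"},
    {"name": "Carol", "openid": "http://carol.example.org/", "source": "twitter"},
]


def suggest_local_connections(contacts, members):
    """Return the contacts who already hold an account on the service."""
    return [
        {**c, "local_user": members[c["openid"]]}
        for c in contacts
        if c["openid"] in members
    ]


def can_see_pictures(connection):
    """Example rule from the scenario: contacts known from Flickr may see pictures."""
    return connection["source"] == "flickr"


suggestions = suggest_local_connections(bobs_contacts, networkr_members)
picture_viewers = [c["local_user"] for c in suggestions if can_see_pictures(c)]
```

Matching on an identifier such as an OpenID (rather than a display name) is a design choice made here because it is one of the properties the slides list as identifying a person uniquely.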

Data Portability = SIOC + FOAF

Social Media Contributions

• Lots of user-created content posted on the Web:
  – Blog posts, wiki pages, bulletin board threads
  – Called “Social Media Contributions” or SMC
• Distributed content
  – Blogging platform, photo-sharing website, social bookmarking service …
• A need for common semantics to
  – Provide a single model for any SMC, wherever it comes from
  – Enable the use of SPARQL queries instead of proprietary APIs
  – Interlink data and find relationships between content
  – Move from documents to resources, from WWW to GGG (Giant Global Graph)

SIOC

(pron. as “shock”)

SIOC

Social Web + Semantic Web

Vocabulary for expressing the Social Web (user-created content) on the Semantic Web (web of data)

http://sioc-project.org

Modeling Social Media Contributions

• SIOC - Semantically-Interlinked Online Communities
  – http://sioc-project.org
  – A vocabulary to represent the activities of online communities
  – More than 40 applications, mainly open-source
  – W3C Member Submission, June 2007
    • http://www.w3.org/Submission/2007/02/
• Namespace: http://rdfs.org/sioc/ns
  – Five top-level classes:
    • User / Role / Space / Container / Item
• Items, created by User(s), organized in Containers

The SIOC ontology

• The main classes and properties are:

Connecting people and their user accounts

• The sioc:User class:
  – Can be thought of as a virtual representation of any person online, within the context of a given social media website or community
  – A subclass of foaf:OnlineAccount
• foaf:holdsAccount property:
  – “The foaf:holdsAccount property relates a foaf:Agent to a foaf:OnlineAccount for which they are the sole account holder.”
  – Links a foaf:Person to various sioc:User(s)
  – As many sioc:User(s) as required can be linked to a single person
  – One person, various identities
• Users create and manage content:
  – has_creator and has_modifier properties
  – :blogpost123 sioc:has_creator :john .
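
To show how foaf:holdsAccount and sioc:has_creator combine, the sketch below collects everything a person has created across several sites by first resolving their accounts. The data and identifiers (`blog:`, `forum:`, `:JohnPerson`) are hypothetical, and plain tuples stand in for a real RDF store.

```python
# Hypothetical triples: one person, two accounts, content on two sites.
triples = [
    (":JohnPerson", "foaf:holdsAccount", "blog:john"),
    (":JohnPerson", "foaf:holdsAccount", "forum:john42"),
    ("blog:post123", "sioc:has_creator", "blog:john"),
    ("forum:thread7#2", "sioc:has_creator", "forum:john42"),
    ("forum:thread7#3", "sioc:has_creator", "forum:someone_else"),
]


def accounts_of(person, graph):
    """All sioc:User accounts that a foaf:Person holds."""
    return {o for s, p, o in graph if s == person and p == "foaf:holdsAccount"}


def content_by(person, graph):
    """All items created by any of the person's accounts, across sites."""
    accounts = accounts_of(person, graph)
    return sorted(
        s for s, p, o in graph
        if p == "sioc:has_creator" and o in accounts
    )


johns_content = content_by(":JohnPerson", triples)
```

The two-step lookup (person to accounts, accounts to items) mirrors the FOAF/SIOC split: FOAF describes the person, SIOC describes what each account did.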

A person and their user accounts

+ SKOS for topics and categories

• Interlinking using common categories:
  – Share tags and topics across different content
• SKOS (Simple Knowledge Organisation System):
  – http://www.w3.org/2004/02/skos/
  – A vocabulary to describe controlled vocabularies
  – Used in the “Tag Ontology”: http://www.holygoat.co.uk/projects/tags/

Interlinking content with SKOS

[Figure: interlinking content with SKOS; property labels: skos:isSubjectOf, sioc:topic]

Interlinking content items

• Can create direct links between instances of sioc:Item:
  – Link from a blog post to a bulletin board page
  – sioc:related_to, sioc:links_to
• Interlinking using common categories:
  – SKOS (Simple Knowledge Organisation System):
    • http://www.w3.org/2004/02/skos/
• Interlink using existing URIs as topics:
  – geonames.org, DBpedia, Revyu
  – MOAT (Meaning of a Tag) simplifies linking content to such URIs:
    • http://moat-project.org/
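
As a small illustration of interlinking through shared topic URIs, the following Python sketch groups items by their sioc:topic and looks up what else shares a topic with a given item. The item identifiers and the `dbpedia:` topic names are made up for the example.

```python
from collections import defaultdict

# Hypothetical items from two different sites, tagged with shared topic URIs.
triples = [
    ("blog:post1", "sioc:topic", "dbpedia:Dublin"),
    ("forum:thread9", "sioc:topic", "dbpedia:Dublin"),
    ("blog:post2", "sioc:topic", "dbpedia:Semantic_Web"),
    ("blog:post1", "sioc:links_to", "forum:thread9"),
]

# Index: topic URI -> set of items about that topic.
items_by_topic = defaultdict(set)
for s, p, o in triples:
    if p == "sioc:topic":
        items_by_topic[o].add(s)


def related_by_topic(item):
    """Items that share at least one topic URI with the given item."""
    related = set()
    for members in items_by_topic.values():
        if item in members:
            related |= members
    related.discard(item)
    return sorted(related)


related = related_by_topic("blog:post1")
```

Because both sites point at the same topic URI, the blog post and the forum thread become connected even if neither links to the other directly.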

Producing SW data from social media sites

• We can use SIOC as a common framework / data model for expressing social media data on the Semantic Web

• The “SIOC Types” module:
  – Additional classes and properties for expressing different kinds of social media / Web 2.0 content
  – Connects SIOC with domain-specific ontologies
• Reuse existing ontologies:
  – Dublin Core, FOAF, SKOS, etc.

SIOC data producers

• SIOC applications list:
  – http://rdfs.org/sioc/applications/
• > 20 applications for producing SIOC data:
  – Free and open source
• SIOC export tools for:
  – Blogs and forums: WordPress, phpBB, Drupal, b2evolution
  – “Legacy” applications: mailing lists, IRC
  – New media: Twitter, Jaiku, Facebook, Flickr
  – Enterprise applications: CWE (collaborative work environments)

Case studies

• WordPress SIOC exporter:
  – http://sioc-project.org/wordpress
  – First SIOC plugin created, custom built
• vBulletin and phpBB SIOC exporters:
  – http://wiki.sioc-project.org/index.php/VBSIOC
  – http://sioc-project.org/phpbb
  – Use the SIOC API for PHP

Overview of WordPress SIOC exporter

• Installation:
  – Download from http://sioc-project.org/wordpress
  – “Drop” two files into the WordPress plugins folder
  – Go to the administrator’s user interface
  – Plugins → SIOC Plugin → “Activate”
• SIOC data created for every page:
  – Data describing all blog posts, comments, users, etc.
  – SIOC data can be discovered via RDF autodiscovery links:

      <link rel="meta" type="application/rdf+xml" title="SIOC"
            href="http://www.johnbreslin.com/blog/index.php?sioc_type=site" />

• Data can be explored or crawled using existing Semantic Web applications

Sample export of SIOC data from WordPress

RDF data from the WordPress export plugin, displayed in the SIOC Browser

SIOC export APIs

• Benefits:
  – Hides the complexity from application developers
  – Can be used by people who are not Semantic Web experts
  – Automatically updated according to changes in the SIOC ontology and best practices documents
• Existing SIOC APIs:
  – Java
  – Perl (new!)
  – PHP (most used)
  – RDFa on Rails
• See “2.1 SIOC APIs” in http://rdfs.org/sioc/applications/

Using SIOC and FOAF to represent portable data

WordPress SIOC Import plugin

• http://wiki.sioc-project.org/w/SIOC_Import_Plugin
  – See the wiki for detailed information
• Download, unzip, activate
• Import some data (Admin -> Options -> SIOC Import)
  – SIOC data (external) from http://johnbreslin.com/blog/
  – SIOC data (embedded RDFa) from http://www.wikier.org/blog/talk-in-deri#post

Moving SIOC data between sites

• Importing SIOC data is easy:
  – Parse SIOC RDF data (e.g. using ARC2 or RAP for PHP)
  – Convert SIOC data to the content model of the target system:
    • e.g. content and other properties of blog posts and comments
  – Store data in the target application:
    • The most difficult part
    • WordPress plugin
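
The first two import steps (parse, then convert to the target content model) can be sketched as follows. This is a simplified Python illustration, not the WordPress import plugin: it reads one fixed, flat RDF/XML shape with the standard-library XML parser rather than a real RDF parser such as ARC2 or RAP, and the sample document and the target record fields (`title`, `body`, and so on) are assumptions.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "sioc": "http://rdfs.org/sioc/ns#",
    "dcterms": "http://purl.org/dc/terms/",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]
RDF_RESOURCE = "{%s}resource" % NS["rdf"]

# Hypothetical SIOC export describing a single blog post.
SIOC_RDF = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:sioc="http://rdfs.org/sioc/ns#"
         xmlns:dcterms="http://purl.org/dc/terms/">
  <sioc:Post rdf:about="http://example.org/blog/?p=1">
    <dcterms:title>Hello XTech</dcterms:title>
    <dcterms:created>2008-05-09T10:00:00Z</dcterms:created>
    <sioc:has_creator rdf:resource="http://example.org/blog/author/1"/>
    <sioc:content>First post about data portability.</sioc:content>
  </sioc:Post>
</rdf:RDF>"""


def sioc_posts_to_records(rdf_xml):
    """Convert sioc:Post elements into plain dicts for a target blog engine."""
    root = ET.fromstring(rdf_xml)
    records = []
    for post in root.findall("sioc:Post", NS):
        creator = post.find("sioc:has_creator", NS)
        records.append({
            "source_uri": post.get(RDF_ABOUT),
            "title": post.findtext("dcterms:title", default="", namespaces=NS),
            "date": post.findtext("dcterms:created", default="", namespaces=NS),
            "author_uri": creator.get(RDF_RESOURCE) if creator is not None else None,
            "body": post.findtext("sioc:content", default="", namespaces=NS),
        })
    return records


records = sioc_posts_to_records(SIOC_RDF)
```

The third step, storing the records, depends entirely on the target application, which is why the slide calls it the most difficult part; it is deliberately left out here.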

How can you use and extend it?

• Horizontal scaling
  – Import / copy / move whole sites
• More intelligent import
  – Matching categories, users to those already on the site
• Combine with messaging / pubsub
  – Use RDF (SIOC+FOAF+…) for data transferred via XMPP
• How would you use it?
  – This demo was just a proof-of-concept
  – More exciting use cases are ahead

Summary

• Data portability is an important issue
  – Many networks, many friends, many contributions
  – Distributed in proprietary data silos
• Open up the silos, open your data
  – Use standard APIs and languages to get your data
• SIOC + FOAF = a part of the Data Portability toolbox
  – Combination of open, lightweight data formats
  – Extensible with domain-specific ontologies
  – Linked Data

Future Events

• Scripting for the Semantic Web (SFSW 2008)
  – Workshop with many interesting contributions
  – Full papers already available online
    • E.g., “Microblogging: A Semantic Web and Distributed Approach”
  – Date / Location: June 2008 - Tenerife, Spain
  – http://www.semanticscripting.org/SFSW2008/
• Social Data on the Web (SDoW 2008)
  – Workshop combining the Social Web and Semantic Web
  – Awaiting contributions, paper submission open
  – Date / Location: October 2008 - Karlsruhe, Germany
  – http://sdow2008.semanticweb.org/

Towards a Social Semantic Web!

• Vocabularies and tools (APIs, producers…) already exist
  – http://esw.w3.org/topic/SemanticWebTools
  – http://sioc-project.org/applications
  – There must be something you can use
• Join us and contribute!
  – http://sioc-project.org
  – #sioc on irc.freenode.net
  – sioc-dev on Google Groups: http://groups.google.com/group/sioc-dev

Contacts

• Uldis Bojārs
  – uldis.bojars@deri.org // http://captsolo.net/info/
• Alexandre Passant
  – alex@passant.org // http://apassant.net
• John Breslin
  – john.breslin@deri.org // http://johnbreslin.com
• Thanks to:
  – Dan Brickley (FOAF) for his valuable comments
  – Tuukka Hastrup and Thomas Schandl for work on evolving the SIOC Import plugin