Social Web 2016 Lecture 3 : What DATA looks like on the Social Web? Davide Ceolin (credits to: Lora Aroyo) The Network Institute VU University Amsterdam


Page 1: VU University Amsterdam - The Social Web 2016 - Lecture 3

Social Web 2016

Lecture 3: What DATA looks like on the Social Web?

Davide Ceolin (credits to: Lora Aroyo)

The Network Institute

VU University Amsterdam

Page 2: VU University Amsterdam - The Social Web 2016 - Lecture 3

PSA - Assignment 1

• Your own vision, based on your analysis of the prime privacy-related issues & initiatives on the (Social) Web.

• Summarise all the legal contexts for privacy & ownership.

• Compare initiatives according to their advantages & disadvantages. Also include your own advice to policy makers (position).

• Write for people who didn’t attend the course

• Attach all the mind maps from lecture 1 and 2.

• All visuals, e.g. screenshots and diagrams, should be included in the appendix

• Submit only 1 file in PDF

Social Web 2016, Davide Ceolin

Page 3: VU University Amsterdam - The Social Web 2016 - Lecture 3

PSA - Assignment 1

• Grading:

• Part 1 - 35%

• Part 2 - 55%

• Preparatory assignments - 10%

Page 4: VU University Amsterdam - The Social Web 2016 - Lecture 3

PSA - Final Assignment

• Final assignment

• Group feedback sent

• The theme is the Network Institute

Page 5: VU University Amsterdam - The Social Web 2016 - Lecture 3

PSA - Deadlines

• Friday, February 19th, 10:00 - Post your questions

• Friday, February 19th, 17:00 - Vote on the questions

• Friday, February 19th, 23:59 - Assignment 1

Page 6: VU University Amsterdam - The Social Web 2016 - Lecture 3

History of Blogs

• evolved from online diaries in the 1980s

• ‘weblog’: Jorn Barger (1997) & ‘BLOG’: Peter Merholz (1999)

• used to share the results of Web searches

• one of the first ways to contribute (unstructured, user-generated) content on the Web

• Justin Hall recognized as pioneer blogger (1994)

• Nature: political, technical, art, journalistic, cultural, personal

• Software: WordPress, Blogger, LiveJournal

Page 7: VU University Amsterdam - The Social Web 2016 - Lecture 3

Types of Blogs

• single- or multi-authored

• photo-blog, video-blog, audio-blog

• life (b)log, now also micro-lifeblog (Twitter)

• lifecasting: started in 2007 by Justin Kan with a webcam on a cap

• Gordon Bell's MyLifeBits: Microsoft SenseCam

http://www.justin.tv/

http://research.microsoft.com/en-us/projects/mylifebits/

Page 8: VU University Amsterdam - The Social Web 2016 - Lecture 3

Wikis

• “Wiki” is Hawaiian for fast/quick

• "the simplest online database that could possibly work" (Ward Cunningham, 1995)

• first wiki software: WikiWikiWeb (the QuickWeb)

• first example of large-scale collaborative editing = software + process

• a commonly used software package is MediaWiki (known from Wikipedia)

• page structure & formatting: a simplified markup language (wikitext), HTML tags, or WYSIWYG editing

http://c2.com/cgi/wiki?WikiWikiWeb

http://www.flickr.com/photos/kables/1220574200/

Page 9: VU University Amsterdam - The Social Web 2016 - Lecture 3

http://en.wikipedia.org/wiki/List_of_wikis

http://www.wikimedia.org/


Page 10: VU University Amsterdam - The Social Web 2016 - Lecture 3

Exploiting the crowd

• in wiki applications the crowd contributes collective intelligence (primarily textual)

• later other media & resources emerged as well, e.g., photo, video, music

• crowdsourcing

Page 11: VU University Amsterdam - The Social Web 2016 - Lecture 3

Mechanical Turk

• 1770 Wolfgang von Kempelen: The Turk

• 2005 Amazon: Amazon Mechanical Turk

• marketplace for work; people perform tasks computers are lousy at, e.g. identifying items in a photo/video, writing product descriptions, transcribing podcasts

• HITs = human intelligence tasks

• require little time & offer little compensation

• workers & requesters


Page 12: VU University Amsterdam - The Social Web 2016 - Lecture 3

Crowdsourcing Science


Page 13: VU University Amsterdam - The Social Web 2016 - Lecture 3

Crowdsourcing History


Page 14: VU University Amsterdam - The Social Web 2016 - Lecture 3


Page 15: VU University Amsterdam - The Social Web 2016 - Lecture 3

Was the $1 million Netflix Prize a victory for crowdsourcing?

Page 16: VU University Amsterdam - The Social Web 2016 - Lecture 3

Folksonomy

• On the social web the user-generated content is organized in light-weight ontologies, i.e., folksonomies

• Community-based semantics = a relationship between Users, Tags & Resources

• user-created, bottom-up classification/categorization of (domain) terms / user-labels, e.g., tags

• tagging = the social process where lay users attach labels to resources (as opposed to annotation by professional experts)
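The Users / Tags / Resources relationship above can be sketched as a list of tag assignments. This is only an illustration: the users, tags and photo identifiers are invented, and real folksonomy data would come from a service such as a photo-sharing site.

```python
from collections import Counter, defaultdict

# Hypothetical tag assignments: one (user, tag, resource) triple per tagging act.
ASSIGNMENTS = [
    ("alice", "amsterdam", "photo1"),
    ("alice", "canal", "photo1"),
    ("bob", "amsterdam", "photo1"),
    ("bob", "amsterdam", "photo2"),
    ("carol", "bike", "photo2"),
]


def tags_per_resource(assignments):
    """Count how often each tag was attached to each resource."""
    counts = defaultdict(Counter)
    for _user, tag, resource in assignments:
        counts[resource][tag] += 1
    return counts


def resources_for_tag(assignments, tag):
    """Return all resources that at least one user labelled with `tag`."""
    return sorted({r for _u, t, r in assignments if t == tag})


counts = tags_per_resource(ASSIGNMENTS)
amsterdam_photos = resources_for_tag(ASSIGNMENTS, "amsterdam")
print(counts["photo1"].most_common())
print(amsterdam_photos)
```

The classification is bottom-up: nothing defines "amsterdam" in advance, the category exists only because users applied the label.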


Page 17: VU University Amsterdam - The Social Web 2016 - Lecture 3

Folksonomy


Page 18: VU University Amsterdam - The Social Web 2016 - Lecture 3

Folksonomy


Page 19: VU University Amsterdam - The Social Web 2016 - Lecture 3

• cleaning messy data

• transforming data from one format to another

• fetching missing data

Page 20: VU University Amsterdam - The Social Web 2016 - Lecture 3

Questions

How critical is the quality of the data on the Web? Does structured mark-up help?

How do we measure the quality?


Page 21: VU University Amsterdam - The Social Web 2016 - Lecture 3


Page 22: VU University Amsterdam - The Social Web 2016 - Lecture 3


Page 23: VU University Amsterdam - The Social Web 2016 - Lecture 3

Vocabularies on the (Social) Web

• to create interfaces or exchange data between applications, the software needs to know the terms used in the data

• vocabularies define a set of terms in a certain domain, e.g., describing people, relationships, content of different types

Page 24: VU University Amsterdam - The Social Web 2016 - Lecture 3

Linked Data & FOAF

FOAF

• FOAF = Friend of a Friend, http://www.foaf-project.org/

• a machine-readable ontology describing persons, their activities & their relations to other people and objects

• an open, decentralized technology for connecting social Web sites, & the people they describe

• Since mid-2000

• Stable core of classes & properties

• New terms may be added at any time

• FOAF RDF namespace URI is fixed

• http://xmlns.com/foaf/spec/

• model for publishing simple factual data via a network of linked RDF documents

• FOAF is an attempt to use the Web to:

• integrate factual information with information in human-oriented documents (e.g. videos, books, spreadsheets, 3d models)

• and info that is still in people's heads

• linking networks of information with networks of people

Page 25: VU University Amsterdam - The Social Web 2016 - Lecture 3

FOAF Example

• there is a foaf:Person

• with a foaf:name property of 'Dan Brickley'

• in foaf:homepage and foaf:openid relationships to a thing called http://danbri.org/

• in foaf:img relationship to a thing referenced by a relative URI of /images/me.jpg

Create your own FOAF file: http://www.ldodds.com/foaf/foaf-a-matic
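The description above can be written as RDF/XML and read back with Python's standard library. This is a minimal sketch: the RDF/XML string is a reconstruction from the bullet points (the slide's own markup is not in this transcript), and the function name describe_person is made up for the example. A real application would more likely use a dedicated RDF library.

```python
import xml.etree.ElementTree as ET

FOAF = "http://xmlns.com/foaf/0.1/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"

# Assumed serialisation of the four statements listed on the slide.
FOAF_DOC = """<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
         xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person>
    <foaf:name>Dan Brickley</foaf:name>
    <foaf:homepage rdf:resource="http://danbri.org/"/>
    <foaf:openid rdf:resource="http://danbri.org/"/>
    <foaf:img rdf:resource="/images/me.jpg"/>
  </foaf:Person>
</rdf:RDF>"""


def describe_person(rdf_xml):
    """Extract the name and linked resources of the first foaf:Person."""
    root = ET.fromstring(rdf_xml)
    person = root.find(f"{{{FOAF}}}Person")
    result = {"name": person.findtext(f"{{{FOAF}}}name")}
    for prop in ("homepage", "openid", "img"):
        node = person.find(f"{{{FOAF}}}{prop}")
        # Relationships point at other things via rdf:resource.
        result[prop] = node.get(f"{{{RDF}}}resource")
    return result


person = describe_person(FOAF_DOC)
print(person)
```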


Page 26: VU University Amsterdam - The Social Web 2016 - Lecture 3

foaf:depiction


Page 27: VU University Amsterdam - The Social Web 2016 - Lecture 3

FOAF Auto-Discovery

• If you publish a FOAF self-description (e.g. using foaf-a-matic) you can make it easier for tools to find your FOAF by putting markup in the head of your HTML homepage

• foaf.rdf is a common choice of filename
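A tool could discover the FOAF file roughly as follows. The sketch assumes the commonly used link element with type "application/rdf+xml" in the page head; the sample page and the class name FoafLinkFinder are invented for illustration, since the slide's own markup is not in this transcript.

```python
from html.parser import HTMLParser

# Hypothetical homepage that advertises its FOAF file in the head.
PAGE = """<html><head>
<title>My homepage</title>
<link rel="meta" type="application/rdf+xml" title="FOAF" href="foaf.rdf" />
</head><body><p>Hello</p></body></html>"""


class FoafLinkFinder(HTMLParser):
    """Collect href values of link elements that point at RDF/XML documents."""

    def __init__(self):
        super().__init__()
        self.foaf_links = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        if attributes.get("type") == "application/rdf+xml" and attributes.get("href"):
            self.foaf_links.append(attributes["href"])


finder = FoafLinkFinder()
finder.feed(PAGE)
print(finder.foaf_links)
```

The href is relative here, so a crawler would still need to resolve it against the page URL before fetching it.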


Page 28: VU University Amsterdam - The Social Web 2016 - Lecture 3

SIOC

• Semantically-Interlinked Online Communities

• ontology for representing rich data from the Social Web in RDF

• a standard way of expressing user-generated content

• methods for interconnecting discussions (e.g., blogs, forums & mailing lists) that enable the integration of online community information

• used in conjunction with FOAF vocabulary for expressing personal profile & social networking information

• http://sioc-project.org/


Page 29: VU University Amsterdam - The Social Web 2016 - Lecture 3

<sioc:Post rdf:about="http://jbreslin.com/blog/2006/09/07/creating-connections">
  <dc:title>Creating connections between discussion clouds with SIOC</dc:title>
  <dcterms:created>2006-09-07T09:33:30Z</dcterms:created>
  <sioc:has_container rdf:resource="http://jbreslin.com/blog/index.php?sioc_type=site#weblog"/>
  <sioc:has_creator>
    <sioc:UserAccount rdf:about="http://jbreslin.com/blog/author/cloud/" rdfs:label="Cloud">
      <rdfs:seeAlso rdf:resource="http://jbreslin.com/blog/index.php?sioc_type=user&amp;sioc_id=1"/>
    </sioc:UserAccount>
  </sioc:has_creator>
  <foaf:maker rdf:resource="http://jbreslin.com/blog/author/cloud/#foaf"/>
  <sioc:content>SIOC provides a unified vocabulary for content and interaction description: a semantic layer that can co-exist with existing discussion platforms.</sioc:content>
  <sioc:topic rdfs:label="Semantic Web" rdf:resource="http://jbreslin.com/blog/category/semantic-web/"/>
  <sioc:topic rdfs:label="Blogs" rdf:resource="http://jbreslin.com/blog/category/blogs/"/>
  <sioc:has_reply>
    <sioc:Post rdf:about="http://jbreslin.com/blog/2006/09/07/creating-connections/#comment-123928">
      <rdfs:seeAlso rdf:resource="http://johnbreslin.com/blog/index.php?sioc_type=comment&amp;sioc_id=123928"/>
    </sioc:Post>
  </sioc:has_reply>
</sioc:Post>

• A post (1) titled "Creating connections between discussion clouds with SIOC" (2) created at 09:33:30 on 2006-09-07 (3) written by user "Cloud" (4) on topics "Blogs" and "Semantic Web" (5) with contents described in sioc:content.

• (6) More information about its author at http://johnbreslin.com/blog/index.php?sioc_type=user&sioc_id=1

• The post has (7) a reply and (8) detailed SIOC information about this reply can be found at http://johnbreslin.com/blog/index.php?sioc_type=comment&sioc_id=123928
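The reading of the post given above can be reproduced mechanically. The sketch below parses a shortened copy of the SIOC example with the standard library; the rdf:RDF wrapper with namespace declarations is added here so the fragment parses on its own, and the function name summarise_post is made up for the example.

```python
import xml.etree.ElementTree as ET

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "dc": "http://purl.org/dc/elements/1.1/",
    "dcterms": "http://purl.org/dc/terms/",
    "sioc": "http://rdfs.org/sioc/ns#",
}

# Shortened copy of the slide's example, wrapped in rdf:RDF (an addition).
SIOC_DOC = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:dc="http://purl.org/dc/elements/1.1/"
    xmlns:dcterms="http://purl.org/dc/terms/"
    xmlns:sioc="http://rdfs.org/sioc/ns#">
  <sioc:Post rdf:about="http://jbreslin.com/blog/2006/09/07/creating-connections">
    <dc:title>Creating connections between discussion clouds with SIOC</dc:title>
    <dcterms:created>2006-09-07T09:33:30Z</dcterms:created>
    <sioc:has_creator>
      <sioc:UserAccount rdf:about="http://jbreslin.com/blog/author/cloud/" rdfs:label="Cloud"/>
    </sioc:has_creator>
    <sioc:topic rdfs:label="Semantic Web" rdf:resource="http://jbreslin.com/blog/category/semantic-web/"/>
    <sioc:topic rdfs:label="Blogs" rdf:resource="http://jbreslin.com/blog/category/blogs/"/>
    <sioc:has_reply>
      <sioc:Post rdf:about="http://jbreslin.com/blog/2006/09/07/creating-connections/#comment-123928"/>
    </sioc:has_reply>
  </sioc:Post>
</rdf:RDF>"""


def qname(prefix, local):
    """Build the {namespace}local form that ElementTree uses for attributes."""
    return "{%s}%s" % (NS[prefix], local)


def summarise_post(rdf_xml):
    """Pull out title, creation time, author, topics and replies of a post."""
    post = ET.fromstring(rdf_xml).find("sioc:Post", NS)
    creator = post.find("sioc:has_creator/sioc:UserAccount", NS)
    return {
        "title": post.findtext("dc:title", namespaces=NS),
        "created": post.findtext("dcterms:created", namespaces=NS),
        "creator": creator.get(qname("rdfs", "label")),
        "topics": sorted(t.get(qname("rdfs", "label"))
                         for t in post.findall("sioc:topic", NS)),
        "replies": [r.get(qname("rdf", "about"))
                    for r in post.findall("sioc:has_reply/sioc:Post", NS)],
    }


summary = summarise_post(SIOC_DOC)
print(summary)
```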


Page 30: VU University Amsterdam - The Social Web 2016 - Lecture 3


Page 31: VU University Amsterdam - The Social Web 2016 - Lecture 3

Semantics in Facebook


Page 32: VU University Amsterdam - The Social Web 2016 - Lecture 3

Activity Streams

• A list of recent activities performed by someone on a website

• Example: Facebook News Feed

• Activity Streams project aims at an activity stream protocol to syndicate activities across social Web applications

• Major websites with activity stream implementations have already opened up their activity streams to developers to use, e.g., Facebook and MySpace

• http://activitystrea.ms/


Page 33: VU University Amsterdam - The Social Web 2016 - Lecture 3

Activity Streams Specification

• an actor, a verb, an object and a target

• person performing an action on/with an object

• Geraldine posted a photo to her album

• John shared a video

• activity metadata to present to a user in a rich human-friendly format, e.g. constructing readable sentences about the activity that occurred, visual representations of the activity, or combining similar activities for display

• Activities are serialized using the JSON format

• There is also an Atom-oriented specification
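A single activity with actor, verb, object and target might look like the sketch below once serialized as JSON. The field values, the timestamp and the small verb-to-past-tense table are illustrative assumptions, not taken from the specification or the slide.

```python
import json

# Hypothetical activity: "Geraldine posted a photo to her album".
activity = {
    "published": "2016-02-15T10:00:00Z",
    "actor": {"objectType": "person", "displayName": "Geraldine"},
    "verb": "post",
    "object": {"objectType": "photo", "displayName": "a photo"},
    "target": {"objectType": "collection", "displayName": "her album"},
}

# Illustrative mapping from verbs to a human-friendly past tense.
PAST_TENSE = {"post": "posted", "share": "shared"}


def to_sentence(act):
    """Render an activity as a readable sentence for display to a user."""
    verb = PAST_TENSE.get(act["verb"], act["verb"])
    parts = [act["actor"]["displayName"], verb, act["object"]["displayName"]]
    if "target" in act:  # the target is optional, e.g. "John shared a video"
        parts += ["to", act["target"]["displayName"]]
    return " ".join(parts)


serialized = json.dumps(activity, indent=2)   # what would be syndicated
sentence = to_sentence(json.loads(serialized))
print(serialized)
print(sentence)
```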


Page 34: VU University Amsterdam - The Social Web 2016 - Lecture 3

Verbs, Objects, Mapping

[Figure: lists of Verbs and Objects]

http://wiki.activitystrea.ms/w/page/1359319/Verb%20Mapping


Page 35: VU University Amsterdam - The Social Web 2016 - Lecture 3

XFN

• XHTML Friends Network

• defines a small set of values that describe personal relationships

In HTML and XHTML, these are given as values of the rel attribute on a hyperlink. XFN allows authors to indicate which weblogs belong to friends, whom they've physically met, and other personal relationships. XFN values help to humanize blogrolls and link pages.

• with XFN, authors can easily style all links of a particular type, e.g., friends could be boldfaced, co-workers italicized, etc.

• http://gmpg.org/xfn/

Page 36: VU University Amsterdam - The Social Web 2016 - Lecture 3

XFN Example

• Joe has a set of five links in his blogroll: his girlfriend Jane; his friends Dave and Darryl; industry expert James, who Joe briefly met once at a conference; and MetaFilter.

• MetaFilter gets no value since it is not an actual person

http://gmpg.org/xfn/intro
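Joe's blogroll can be turned into relationship edges with a few lines of Python. The markup below is a plausible rendering of the example (the URLs and the exact rel values chosen for each person are assumptions), and "joe" as the source node is a label picked for this sketch.

```python
from html.parser import HTMLParser

# Assumed markup for Joe's blogroll; MetaFilter carries no rel value.
BLOGROLL = """
<a href="http://jane-blog.example.org/" rel="sweetheart date met">Jane</a>
<a href="http://dave-blog.example.org/" rel="friend met">Dave</a>
<a href="http://darryl-blog.example.org/" rel="friend met">Darryl</a>
<a href="http://www.metafilter.com/">MetaFilter</a>
<a href="http://james-blog.example.com/" rel="met">James Expert</a>
"""

# The XFN relationship values; other rel values (e.g. nofollow) are ignored.
XFN_VALUES = {
    "friend", "acquaintance", "contact", "met", "co-worker", "colleague",
    "co-resident", "neighbor", "child", "parent", "sibling", "spouse",
    "kin", "muse", "crush", "date", "sweetheart", "me",
}


class XfnParser(HTMLParser):
    """Collect (source, relationship, target URL) edges from rel attributes."""

    def __init__(self, source):
        super().__init__()
        self.source = source
        self.edges = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        attributes = dict(attrs)
        href = attributes.get("href")
        for value in (attributes.get("rel") or "").split():
            if href and value in XFN_VALUES:
                self.edges.append((self.source, value, href))


parser = XfnParser("joe")
parser.feed(BLOGROLL)
for edge in parser.edges:
    print(edge)
```

The resulting edge list is the kind of input a graph visualisation of people on a website would start from.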

Page 37: VU University Amsterdam - The Social Web 2016 - Lecture 3

Open Graph

• protocol originally developed at Facebook (“Like” button)

• enables any web page to become a rich object in a social graph, i.e., to have the same functionality as any other object on Facebook

• prefix="og: http://ogp.me/ns#" specifies the OGP vocabulary
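Reading Open Graph metadata from a page could look like this. The sample page and its example.org URLs are invented for illustration; og:title, og:type, og:url and og:image are the basic properties of the protocol.

```python
from html.parser import HTMLParser

# Hypothetical page carrying the four basic Open Graph properties.
PAGE = """<html prefix="og: http://ogp.me/ns#">
<head>
<title>Lecture 3</title>
<meta property="og:title" content="What DATA looks like on the Social Web?" />
<meta property="og:type" content="article" />
<meta property="og:url" content="http://example.org/lecture3" />
<meta property="og:image" content="http://example.org/lecture3.png" />
</head>
<body></body>
</html>"""


class OpenGraphParser(HTMLParser):
    """Collect og:* properties from meta elements."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        prop = attributes.get("property") or ""
        if prop.startswith("og:") and "content" in attributes:
            self.properties[prop] = attributes["content"]


og = OpenGraphParser()
og.feed(PAGE)
print(og.properties)
```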


Page 38: VU University Amsterdam - The Social Web 2016 - Lecture 3


Page 39: VU University Amsterdam - The Social Web 2016 - Lecture 3

Microformats

• Simple, open data formats built upon existing widely adopted standards

• Designed for humans first & machines second

• Highly correlated with semantic XHTML (aka the real world semantics, lowercase semantic web, lossless XHTML)

• “An evolutionary revolution”, by Ryan King

• Attempts to re-use existing HTML tags to convey meta information and other attributes in web pages

Page 40: VU University Amsterdam - The Social Web 2016 - Lecture 3

Your First Microformat

• You can put a microformat on your website in less than 5 mins

• Example: putting an hCard (online business card) on your site

http://microformats.org/get-started

1. Find your name somewhere on your website

2. Wrap your name in an fn (formatted name)

<span class="fn">Jamie Jones</span>

3. Wrap it all in a vcard (declares that everything inside is the hCard microformat):

<span class="vcard"><span class="fn">Jamie Jones</span></span>

<address class="vcard"><span class="fn">Jamie Jones</span></address>

The address element indicates that the person in the hCard is the contact for the page

<p class="vcard">My name is <span class="fn">Jamie Jones</span> I dig microformats!</p>
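The three steps above produce markup that a consumer can read back. A minimal sketch of such a consumer, assuming only the fn property matters and that the markup contains no void elements (such as img or br) inside the vcard; the class name HCardParser is made up for the example.

```python
from html.parser import HTMLParser

PAGE = ('<p class="vcard">My name is <span class="fn">Jamie Jones</span> '
        'I dig microformats!</p>')


class HCardParser(HTMLParser):
    """Collect the formatted name (fn) of every hCard (vcard) on a page."""

    def __init__(self):
        super().__init__()
        self.names = []
        self._vcard_depth = 0  # > 0 while inside an element with class vcard
        self._fn_depth = 0     # > 0 while inside an element with class fn
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").split()
        if self._fn_depth:
            self._fn_depth += 1
        elif self._vcard_depth and "fn" in classes:
            self._fn_depth = 1
            self._buffer = []
        if self._vcard_depth:
            self._vcard_depth += 1
        elif "vcard" in classes:
            self._vcard_depth = 1

    def handle_data(self, data):
        if self._fn_depth:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._fn_depth:
            self._fn_depth -= 1
            if self._fn_depth == 0:
                self.names.append("".join(self._buffer).strip())
        if self._vcard_depth:
            self._vcard_depth -= 1


hcards = HCardParser()
hcards.feed(PAGE)
print(hcards.names)
```

Text outside the fn element ("My name is", "I dig microformats!") stays readable for humans and is simply skipped by the parser, which is the "humans first, machines second" idea.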

Page 41: VU University Amsterdam - The Social Web 2016 - Lecture 3

HTML Microdata

• Allows machine-readable data to be embedded in HTML documents in an easy-to-write manner, with an unambiguous parsing model

• Compatible with numerous data formats, including RDF and JSON

• Consists of groups of name-value pairs: the groups are called items, and each name-value pair is a property

• itemscope is used to create an item

• itemprop is used to add a property to an item

• Microdata DOM API

• http://www.w3.org/TR/microdata/

• It is used to incorporate semantics into existing microformat content on web pages.


Page 42: VU University Amsterdam - The Social Web 2016 - Lecture 3

schema.org

• Google, Yahoo!, Bing

• a common vocabulary for structured data markup on web pages

• improve how sites appear in major search engines

• Google rich snippets of reviews, people, recipes, events in 2009

• superseded Microformats


Page 43: VU University Amsterdam - The Social Web 2016 - Lecture 3

Add Schema.org to HTML using Microdata

<div>
  <h1>Avatar</h1>
  <span>Director: James Cameron (born August 16, 1954)</span>
  <span>Science fiction</span>
  <a href="../movies/avatar-theatrical-trailer.html">Trailer</a>
</div>

<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <div itemprop="director" itemscope itemtype="http://schema.org/Person">
    Director: <span itemprop="name">James Cameron</span>
    (born <span itemprop="birthDate">August 16, 1954</span>)
  </div>
  <span itemprop="genre">Science fiction</span>
  <a href="../movies/avatar-theatrical-trailer.html" itemprop="trailer">Trailer</a>
</div>
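What a consumer such as a search engine gets out of the annotated version can be sketched with a small extractor. This is a simplified reading of Microdata, not the full parsing model: it assumes no void elements, takes href as the value for a elements, and plain text for everything else. The class name MicrodataParser is made up for the example.

```python
from html.parser import HTMLParser

PAGE = """<div itemscope itemtype="http://schema.org/Movie">
  <h1 itemprop="name">Avatar</h1>
  <div itemprop="director" itemscope itemtype="http://schema.org/Person">
    Director: <span itemprop="name">James Cameron</span>
    (born <span itemprop="birthDate">August 16, 1954</span>)
  </div>
  <span itemprop="genre">Science fiction</span>
  <a href="../movies/avatar-theatrical-trailer.html" itemprop="trailer">Trailer</a>
</div>"""


class MicrodataParser(HTMLParser):
    """Build nested items (type + properties) from itemscope/itemprop."""

    def __init__(self):
        super().__init__()
        self.items = []      # top-level items
        self._scopes = []    # items whose element is still open
        self._open = []      # one (new_item, text_target) entry per open element
        self._buffer = None  # collects text of the current text property

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        prop = a.get("itemprop")
        current = self._scopes[-1] if self._scopes else None
        new_item, text_target = None, None
        if "itemscope" in a:
            # itemscope creates an item; with itemprop it is a nested value.
            new_item = {"type": a.get("itemtype"), "properties": {}}
            if prop and current is not None:
                current["properties"][prop] = new_item
            else:
                self.items.append(new_item)
            self._scopes.append(new_item)
        elif prop and current is not None:
            if tag == "a":
                current["properties"][prop] = a.get("href")
            else:
                text_target = (current, prop)
                self._buffer = []
        self._open.append((new_item, text_target))

    def handle_data(self, data):
        if self._buffer is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if not self._open:
            return
        new_item, text_target = self._open.pop()
        if new_item is not None:
            self._scopes.pop()
        if text_target is not None:
            item, prop = text_target
            item["properties"][prop] = "".join(self._buffer).strip()
            self._buffer = None


microdata = MicrodataParser()
microdata.feed(PAGE)
movie = microdata.items[0]
print(movie)
```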


Page 44: VU University Amsterdam - The Social Web 2016 - Lecture 3

RDFa

• Another syntax for RDF

• HTML5 extension for marking up People, Places, Events, Recipes, Reviews: specifying that a piece of text is the name of a product, person, or event = “adding semantic markup”

• RDFa 1.1 = specified for XHTML and HTML5 (and for any XML-based language, e.g., SVG)

• RDFa Lite = “a small subset of RDFa consisting of a few attributes that may be applied to most simple to moderate structured data markup tasks.”

• Publish your data as Linked Data through RDFa --> link to other URIs (others can link to your HTML+RDFa)

http://rdfa.info/play/
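A rough sketch of how RDFa Lite markup turns into property/value statements. It handles only the vocab and property attributes, assumes each property element contains plain text, and ignores typeof, resource and prefix. The sample markup, the person's details and the class name RdfaLiteParser are invented for illustration.

```python
from html.parser import HTMLParser

# Hypothetical RDFa Lite markup using the schema.org vocabulary.
PAGE = """<div vocab="http://schema.org/" typeof="Person">
  <span property="name">Jamie Jones</span> works as a
  <span property="jobTitle">Lecturer</span>.
</div>"""


class RdfaLiteParser(HTMLParser):
    """Collect (property URI, text value) pairs from vocab/property attributes."""

    def __init__(self):
        super().__init__()
        self.vocab = None
        self.statements = []
        self._property = None
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        if a.get("vocab"):
            self.vocab = a["vocab"]
        if a.get("property"):
            self._property = a["property"]
            self._buffer = []

    def handle_data(self, data):
        if self._property is not None:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if self._property is not None:
            # Property names resolve against the vocabulary to full URIs,
            # which is what makes the data linkable.
            uri = (self.vocab or "") + self._property
            self.statements.append((uri, "".join(self._buffer).strip()))
            self._property = None


rdfa = RdfaLiteParser()
rdfa.feed(PAGE)
print(rdfa.statements)
```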


Page 45: VU University Amsterdam - The Social Web 2016 - Lecture 3

Quick Structured Data for Your website


Page 46: VU University Amsterdam - The Social Web 2016 - Lecture 3


Page 47: VU University Amsterdam - The Social Web 2016 - Lecture 3

Knowledge Graph

• graph that understands real-world entities and their relationships to one another: things, not strings

• more than 500 million things

• more than 3.5 billion facts about and relationships between these different things

• tuned based on what people search for

• http://www.google.com/insidesearch/features/search/knowledge.html

results in 2013


Page 48: VU University Amsterdam - The Social Web 2016 - Lecture 3

Knowledge Graph

results in 2014

Page 49: VU University Amsterdam - The Social Web 2016 - Lecture 3

Knowledge Graph results in 2014


Page 50: VU University Amsterdam - The Social Web 2016 - Lecture 3

results in 2013 results in 2014


Page 51: VU University Amsterdam - The Social Web 2016 - Lecture 3

results in 2014

results in 2013


Page 52: VU University Amsterdam - The Social Web 2016 - Lecture 3

Question?

For which things on the social web would more vocabularies for embedded semantics be needed (besides what we have already seen)?

Page 53: VU University Amsterdam - The Social Web 2016 - Lecture 3

Hands-on Teaser

• mining data in various social web formats

• see the differences in what each of the formats can contain & what purpose they serve

• start: simple search where we pull in some XFN data and visualise a graph of the people that we find on a website

• check: software you will be working with on the website

image source: http://www.flickr.com/photos/bionicteaching/1375254387/