Web-scale IA using Linked Open Data

DESCRIPTION

A preview of the talk I'll be giving at the 2014 IA Summit in San Diego. An introduction to the web of data, and how the BBC and other organisations create products which remix original content with third-party data.

Page 1: Web-scale IA using Linked Open Data

@MikeAtherton | #IAS14

WEB-SCALE IA USING LINKED OPEN DATA

Mike Atherton

RedUXD

The path ahead

Page 2: Web-scale IA using Linked Open Data

Tim Berners-Lee 1989

Page 3: Web-scale IA using Linked Open Data

Defining standards

• Use a common format for publishing documents (HTML)

• Use a common system of addresses to identify and locate documents (URL)

• Establish a method of contextual linking between documents (HREF hyperlink)

Page 4: Web-scale IA using Linked Open Data

Actual feedback on Tim Berners-Lee’s proposal

Page 5: Web-scale IA using Linked Open Data

WEBSITES! WEBSITES! PARTY TIME! EXCELLENT!

Page 6: Web-scale IA using Linked Open Data

What wonderful things we wrote for people!

Page 7: Web-scale IA using Linked Open Data

As humans we can extract meaning and context from documents automatically.

Spot the difference.

Page 8: Web-scale IA using Linked Open Data

The context of keywords doesn’t travel with them.

Tag: “Apple”

Page 9: Web-scale IA using Linked Open Data

We can pick out the important things and relationships just by reading.

For humans, the distinction between documents and data is subtle.

Page 10: Web-scale IA using Linked Open Data

By defining real-world things, we can teach computers the relationships between those things.

Computers need to be told which things our documents contain.

Page 11: Web-scale IA using Linked Open Data

If a computer knows what ‘Mount Everest’ is and what ‘tall’ means, it can do the legwork for us.

“How tall is Mount Everest?”

Page 12: Web-scale IA using Linked Open Data

By understanding terms and linking to data services, computers can even find out things they don’t know.

“Where can I get a beer?”

Page 13: Web-scale IA using Linked Open Data

Actual queries from Facebook’s Graph Search tool.

Cross-referencing data points gives new insight.

http://actualfacebookgraphsearches.tumblr.com/

Page 14: Web-scale IA using Linked Open Data

TED conference 2009

Page 15: Web-scale IA using Linked Open Data

Use web addresses to represent real-world things

Tim Berners-Lee

Rule #1 of data publishing

Page 16: Web-scale IA using Linked Open Data

Return useful data about each resource, in a standard format.

Tim Berners-Lee

Rule #2 of data publishing

Page 17: Web-scale IA using Linked Open Data

Include links to other data, so people can discover more things.

Tim Berners-Lee

Rule #3 of data publishing

Page 18: Web-scale IA using Linked Open Data

Linked data

• Use web addresses to represent real-world things.

• Return useful data about each resource, in a standard format.

• Include links to other data, so people can discover more things.

Data sources combined create more insight than studying them separately.
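
As a rough illustration of the three rules, the sketch below models a tiny in-memory "web of data" in Python. The example.org addresses and the `WEB` dictionary are hypothetical stand-ins, not real services; a real client would fetch each address over HTTP rather than look it up in a dictionary.

```python
# Hypothetical in-memory stand-in for the web: each address (rule 1)
# returns structured data (rule 2) that links to other addresses (rule 3).
WEB = {
    "http://example.org/id/haunted-mansion": {
        "label": "The Haunted Mansion",
        "links": ["http://example.org/id/new-orleans-square"],
    },
    "http://example.org/id/new-orleans-square": {
        "label": "New Orleans Square",
        "links": ["http://example.org/id/disneyland"],
    },
    "http://example.org/id/disneyland": {
        "label": "Disneyland",
        "links": [],
    },
}


def discover(start):
    """Follow links from one resource and return the labels found, in visit order."""
    seen, queue, labels = set(), [start], []
    while queue:
        uri = queue.pop(0)
        if uri in seen or uri not in WEB:
            continue
        seen.add(uri)
        labels.append(WEB[uri]["label"])
        queue.extend(WEB[uri]["links"])
    return labels


found = discover("http://example.org/id/haunted-mansion")
print(found)
```

Starting from a single address, the client ends up knowing about things it was never told about directly, which is the point of rule 3.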

Page 19: Web-scale IA using Linked Open Data

Researchers attempting to discover new drugs to treat Alzheimer’s Disease.

“Which proteins are involved in signal transduction AND are related to pyramidal neurons?”

Web search: 223,000 results, 0 answers

Linked healthcare data query: 32 results, 32 answers

Page 20: Web-scale IA using Linked Open Data

We can even create entirely new value propositions from remixing existing content.

Linked data helps us make sense of information.

data.gov.uk Newspaper: Hyper-local news publishing

Land Registry Price Paid: Historical property data

Voter power: Local constituency data

Page 21: Web-scale IA using Linked Open Data

A MINIMUM VIABLE PRESENTATION FOR IA SUMMIT 2014

CONTENT MODELS AT WEB-SCALE

Where next for your content model?

Page 22: Web-scale IA using Linked Open Data
Page 23: Web-scale IA using Linked Open Data

THEME PARKS CONTENT MODEL

[Diagram] Entities: Location, Resort, Park, Hotel, Weenie, Land, Meal, Restaurant, Attraction, Character, Creator, Work

[Diagram] Relationships: locatedIn, hasWeenie, ParentResort, hasEvent, features, adaptationOf, hasAttraction, CreatedBy, appearsIn, hasPark, contains

Page 24: Web-scale IA using Linked Open Data

Use http web addresses to represent real-world things.

But ideally, those addresses should offer robot-readable data.

http://disneyland.disney.go.com/attractions/disneyland/haunted-mansion/

http://www.geonames.org/ontology#locatedIn

http://en.wikipedia.org/wiki/New_Orleans_Square

The Haunted Mansion (is) located in New Orleans Square
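
A minimal sketch of the same statement in Python, assuming we simply treat the three addresses above as subject, predicate, and object. The helper name `to_ntriples_line` is made up for this example; it only handles plain URIs.

```python
# The three addresses shown on the slide, used as subject, predicate, object.
subject = "http://disneyland.disney.go.com/attractions/disneyland/haunted-mansion/"
predicate = "http://www.geonames.org/ontology#locatedIn"
obj = "http://en.wikipedia.org/wiki/New_Orleans_Square"


def to_ntriples_line(s, p, o):
    """Render one statement N-Triples style: three URIs in angle brackets, then a dot."""
    return f"<{s}> <{p}> <{o}> ."


statement = to_ntriples_line(subject, predicate, obj)
print(statement)
```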

Page 25: Web-scale IA using Linked Open Data

The Resource Description Framework is the web’s lingua franca for data integration.

RDF lets different data sources play nice together.

<subject> <predicate> <object>

<Charles Dickens> <is the author of> <Great Expectations>

Page 26: Web-scale IA using Linked Open Data

RDF can be written in different ways.

RDF is an abstract syntax, so all these ‘serialisations’ are equivalent.

N-Triples

<http://dbpedia.org/resource/Charles_Dickens> <http://dbpedia.org/ontology/author> <http://dbpedia.org/resource/Great_Expectations> .
<http://dbpedia.org/resource/Charles_Dickens> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://dbpedia.org/ontology/Person> .

RDF/XML

<?xml version="1.0" encoding="utf-8"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">
  <dbpedia-owl:Person xmlns:dbpedia-owl="http://dbpedia.org/ontology/"
      rdf:about="http://dbpedia.org/resource/Charles_Dickens">
    <dbpedia-owl:author rdf:resource="http://dbpedia.org/resource/Great_Expectations"/>
  </dbpedia-owl:Person>
</rdf:RDF>

Turtle

@prefix rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .

<http://dbpedia.org/resource/Charles_Dickens> a <http://dbpedia.org/ontology/Person> ;
    <http://dbpedia.org/ontology/author> <http://dbpedia.org/resource/Great_Expectations> .
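
To make the "same graph, different spelling" idea concrete, here is a small Python sketch that holds the two Dickens statements as a set of tuples and writes them out in two layouts. The serialisers are deliberately simplified assumptions (full URIs only, no literals, prefixes, or blank nodes) and are not a substitute for a real RDF library.

```python
import re

RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
DICKENS = "http://dbpedia.org/resource/Charles_Dickens"

# The abstract graph: a set of (subject, predicate, object) URI triples.
graph = {
    (DICKENS, "http://dbpedia.org/ontology/author",
     "http://dbpedia.org/resource/Great_Expectations"),
    (DICKENS, RDF_TYPE, "http://dbpedia.org/ontology/Person"),
}


def to_ntriples(triples):
    """One full statement per line."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in sorted(triples))


def to_turtle(triples):
    """Group statements by subject, separating predicate-object pairs with ';'."""
    by_subject = {}
    for s, p, o in sorted(triples):
        by_subject.setdefault(s, []).append(f"<{p}> <{o}>")
    return "\n".join(f"<{s}> " + " ;\n    ".join(pairs) + " ."
                     for s, pairs in by_subject.items())


def parse_ntriples(text):
    """Read URI-only N-Triples back into a set of triples."""
    pattern = re.compile(r"<([^>]*)>\s+<([^>]*)>\s+<([^>]*)>\s*\.")
    return {m.groups() for m in pattern.finditer(text)}


nt_text = to_ntriples(graph)
ttl_text = to_turtle(graph)
print(nt_text)
print(ttl_text)
```

Reading the N-Triples text back gives exactly the set we started with, and the Turtle layout names the subject once instead of twice; the statements themselves do not change.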

Page 27: Web-scale IA using Linked Open Data

Your data

BBC

New York Times

Page 28: Web-scale IA using Linked Open Data

Wikipedia is great for defining individual concepts.

But only for humans! What if we had a common way to define a concept for a robot?

Page 29: Web-scale IA using Linked Open Data

DBpedia is Wikipedia for robots.

It turns Wikipedia content into machine-readable linked data.

Page 30: Web-scale IA using Linked Open Data

Ok computer…

• When did Disneyland first open?

• What is its official homepage?

• Who operates the park?

• When is it open?

• What’s its theme?

Page 31: Web-scale IA using Linked Open Data

MusicBrainz is the open music encyclopedia.

It crowdsources music metadata for use by humans and robots.

Page 32: Web-scale IA using Linked Open Data

By saying our concept is the ‘same as’ an accepted identifier, we all speak the same language.

Shared identifiers act as intermediaries.

http://dbpedia.org/resource/China
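
One way to picture this, sketched in Python: two publishers each keep their own local identifier for China, and each states that it is the ‘same as’ the DBpedia one. The example.org and example.com addresses are hypothetical; only the DBpedia URI and the owl:sameAs predicate come from the real world.

```python
OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"
DBPEDIA_CHINA = "http://dbpedia.org/resource/China"

# Hypothetical local identifiers from two publishers, each declared
# 'same as' the shared DBpedia identifier.
triples = [
    ("http://news.example.org/places/china", OWL_SAME_AS, DBPEDIA_CHINA),
    ("http://travel.example.com/country/cn", OWL_SAME_AS, DBPEDIA_CHINA),
]


def shared_identifier_index(statements):
    """Map each shared identifier to the set of local identifiers that point at it."""
    index = {}
    for subject, predicate, obj in statements:
        if predicate == OWL_SAME_AS:
            index.setdefault(obj, set()).add(subject)
    return index


index = shared_identifier_index(triples)
about_china = index[DBPEDIA_CHINA]
print(sorted(about_china))
```

Neither publisher needs to know about the other; the shared identifier in the middle is enough to line their data up.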

Page 33: Web-scale IA using Linked Open Data

BBC MUSIC

How BBC Music used linked data to get more people listening to the radio.

Page 34: Web-scale IA using Linked Open Data

10 national BBC radio stations.

How can I find out which one I should listen to?

Page 35: Web-scale IA using Linked Open Data

The BBC radio playout system was a data goldmine.

A continuously-updated record of every song played on-air.

Page 36: Web-scale IA using Linked Open Data

The sources combine to create a new and useful product.

Linked data builds a composite picture of the world.

Which song is playing on the radio now?

Who is this artist?

What other stuff has this artist done?

What TV or radio clips of this artist do we have?

Page 37: Web-scale IA using Linked Open Data

Page 38: Web-scale IA using Linked Open Data

By exposing common identifiers, your website becomes its own API.

Maintaining identifiers in your URIs makes playing with your stuff easier.

http://www.bbc.co.uk/music/artists/4d2956d1-a3f7-44bb-9a41-67563e1a0c94

http://musicbrainz.org/artist/4d2956d1-a3f7-44bb-9a41-67563e1a0c94
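
Because both addresses end in the same identifier, hopping from one site to the other is a string operation. A minimal Python sketch, assuming the identifier is always the last path segment (true of the two URLs above, not verified beyond them); the function name is made up for this example.

```python
from urllib.parse import urlparse
import uuid

BBC_ARTIST_URL = "http://www.bbc.co.uk/music/artists/4d2956d1-a3f7-44bb-9a41-67563e1a0c94"


def musicbrainz_url_from_bbc(bbc_url):
    """Take the last path segment as the shared identifier and rebuild the MusicBrainz address."""
    identifier = urlparse(bbc_url).path.rstrip("/").split("/")[-1]
    uuid.UUID(identifier)  # raises ValueError if the segment is not a well-formed UUID
    return f"http://musicbrainz.org/artist/{identifier}"


mb_url = musicbrainz_url_from_bbc(BBC_ARTIST_URL)
print(mb_url)
```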

Page 39: Web-scale IA using Linked Open Data

Creating an artist profile on MusicBrainz automatically creates a BBC Music artist page.

Linked data lets you use the web as a content management system.

freshonthenet.co.uk/musicbrainz/

Page 40: Web-scale IA using Linked Open Data

Filling in the blanks

• The BBC knows when it's played a record by Tom Waits

• MusicBrainz knows all the records Tom Waits ever released

• DBpedia knows Tom Waits is from San Diego

• DBpedia knows Blink-182 are also from San Diego

• The BBC knows when it's played a record by Blink-182
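
The chain of facts above can be sketched as a join across sources. In this Python example each set stands in for one publisher's data; the predicate names `playedOn` and `hometown` are invented for illustration and are not the real BBC or DBpedia vocabularies.

```python
# Illustrative facts only: each set stands in for one publisher's data,
# using the artists named on the slide. Predicate names are hypothetical.
bbc = {("Tom Waits", "playedOn", "BBC radio"), ("Blink-182", "playedOn", "BBC radio")}
dbpedia = {("Tom Waits", "hometown", "San Diego"), ("Blink-182", "hometown", "San Diego")}


def played_artists_from_same_town(artist):
    """Find other artists the BBC has played who share a hometown with `artist`."""
    towns = {o for s, p, o in dbpedia if s == artist and p == "hometown"}
    neighbours = {s for s, p, o in dbpedia if p == "hometown" and o in towns and s != artist}
    played = {s for s, p, o in bbc if p == "playedOn"}
    return sorted(neighbours & played)


suggestions = played_artists_from_same_town("Tom Waits")
print(suggestions)
```

No single source could answer "who else from Tom Waits's hometown have we played?"; the answer only appears once the sources are combined.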

Page 41: Web-scale IA using Linked Open Data

The ‘SPARQL Protocol and RDF Query Language’ lets us query linked data as easily as a local database.

If you want linked data magic, try SPARQL.

SQL: Centralised relational queries

SPARQL: Distributed graph queries

Page 42: Web-scale IA using Linked Open Data

SPARQL query ‘Who wrote Great Expectations?’

PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX dbpprop: <http://dbpedia.org/property/>

SELECT ?s ?title ?author WHERE {
  ?s rdf:type dbpedia-owl:Book .
  ?s dbpedia-owl:author ?author_uri .
  ?author_uri dbpedia-owl:birthName ?author .
  ?s dbpprop:name ?title .
  FILTER (REGEX(STR(?title), "Great Expectations", "i"))
}

The places where the terms we’ll use are defined

Bring back the stuff that matches what I’m about to say…

…which is the birth name of anything said to be the author of a book…

…but only if that book is titled ‘Great Expectations’
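
For anyone who wants to try the query from a script, here is a hedged Python sketch that prepares the HTTP request without sending it. The endpoint address `https://dbpedia.org/sparql` is an assumption to check against DBpedia's own documentation, and the property names are copied from the slide, so they may no longer match the live data.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint address; confirm it before relying on it.
ENDPOINT = "https://dbpedia.org/sparql"

# The query from the slide, unchanged.
QUERY = """
PREFIX rdf: <http://www.w3.org/1999/02/22-rdf-syntax-ns#>
PREFIX dbpedia-owl: <http://dbpedia.org/ontology/>
PREFIX dbpprop: <http://dbpedia.org/property/>

SELECT ?s ?title ?author WHERE {
  ?s rdf:type dbpedia-owl:Book .
  ?s dbpedia-owl:author ?author_uri .
  ?author_uri dbpedia-owl:birthName ?author .
  ?s dbpprop:name ?title .
  FILTER (REGEX(STR(?title), "Great Expectations", "i"))
}
"""


def build_request(endpoint, query):
    """Prepare (but do not send) a SPARQL Protocol GET request asking for JSON results."""
    url = endpoint + "?" + urlencode({"query": query})
    return Request(url, headers={"Accept": "application/sparql-results+json"})


request = build_request(ENDPOINT, QUERY)
print(request.get_method(), request.full_url[:60])
```

Passing `request` to `urllib.request.urlopen` would actually run the query; that step is left out here so the sketch works offline.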

Page 43: Web-scale IA using Linked Open Data

The various sources that make up BBC Wildlife Finder are structured according to the Wildlife Ontology.

The ontology defines how everything hangs together.

http://dbpedia.org/resource/Giant_panda

http://www.bbc.co.uk/programmes/p00k3nx

http://www.bbc.co.uk/news/world-asia-china-24784767

http://worldwildlife.org/species/giant-panda

http://www.iucnredlist.org/details/712/0

http://www.bbc.co.uk/ontologies/wildlife/2010-11-04.shtml

Page 44: Web-scale IA using Linked Open Data

ONTOLOGIES

Creating, publishing, and consuming the words that help us mean what we say.

Page 45: Web-scale IA using Linked Open Data

BBC Wildlife Ontology

http://www.bbc.co.uk/ontologies/wildlife/2010-11-04.shtml

Page 46: Web-scale IA using Linked Open Data

BBC News attempted to model news coverage to better represent how events are related.

Ontologies start with a high-level understanding of the IA.

Page 47: Web-scale IA using Linked Open Data

The chronological chain of events and the graph of supporting coverage are modelled to aid understanding.

News updates connect in sequence to a storyline.

Page 48: Web-scale IA using Linked Open Data

BBC News give journalists the tools to tag stories with web-scale identifiers as they write.

Articles and storylines are tagged with people, places, and subjects.

Page 49: Web-scale IA using Linked Open Data

News Storylines model http://purl.org/ontology/storyline

Page 50: Web-scale IA using Linked Open Data

Detail from the News Storyline ontology

http://purl.org/ontology/storyline

Page 51: Web-scale IA using Linked Open Data

The New York Times offers identifiers for people, places, and subjects.

Used by NYT to aggregate news stories around themed topic pages.

Page 52: Web-scale IA using Linked Open Data

Mashing up sources of data can yield playful or surprisingly useful results.

Linked open data weaves tales of the unexpected.

Page 53: Web-scale IA using Linked Open Data

Earn your stars!

• make your stuff available under an open license

• make it available as structured data

• use non-proprietary formats

• use URIs to identify things

• link your data to other data to provide context
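
A quick way to score your own data against this list, sketched in Python. It treats the stars as cumulative (each one assumes the ones before it), which is how the scheme is usually described; the dictionary keys and the sample dataset are made up for this example.

```python
# The five checks from the slide, in order. Each star assumes the ones before it.
CRITERIA = [
    "open_license",
    "structured",
    "non_proprietary_format",
    "uses_uris",
    "linked_to_other_data",
]


def count_stars(dataset):
    """Count consecutive criteria met, stopping at the first one that is missing."""
    stars = 0
    for criterion in CRITERIA:
        if not dataset.get(criterion, False):
            break
        stars += 1
    return stars


# Hypothetical dataset: an openly licensed CSV download with no URIs or links.
csv_download = {"open_license": True, "structured": True, "non_proprietary_format": True}
stars = count_stars(csv_download)
print(stars)
```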

Page 54: Web-scale IA using Linked Open Data

Linked Open Data cloud 2007

Page 55: Web-scale IA using Linked Open Data

Linked Open Data cloud 2011

Page 56: Web-scale IA using Linked Open Data

FIRST STEPS WITH LINKED DATA

This all sounds awesome! Now what?

Page 57: Web-scale IA using Linked Open Data

1. Mark up content with RDFa

• RDFa is RDF embedded into HTML code to state our subject, predicate, and object.

• Typically:

<subject>: The page we’re adding the markup to

<predicate>: The verb, as defined by an external vocabulary

<object>: The external URI we’re expressing a relationship to

Page 58: Web-scale IA using Linked Open Data

RDFa in action

<div class="vote2013-council-meta" resource="http://www.bbc.co.uk/news/politics/councils/[GSSID]">
  <div vocab="http://iptc.org/std/rNews/2011-10-07#" rel="about" resource="http://www.bbc.co.uk/things/[GUID]#id">
    <div vocab="http://www.w3.org/2002/07/owl#" rel="sameAs" resource="http://opendatacommunities.org/id/[COUNCIL-TYPE]/[COUNCIL-NAME]"></div>
    <div vocab="http://www.bbc.co.uk/ontologies/politics#" rel="governsGSS" resource="http://statistics.data.gov.uk/id/statistical-geography/[GSSID]"></div>
  </div>
</div>

Define which city council this page is about.

Define what we mean by ‘about’ using the rNews vocabulary.

State that the city council we’re talking about is the same one referenced at Open Data Communities.

Using our own ontology, state that this council governs a region identified on data.gov.uk

Thanks to @r4isstatic for this example!
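
To see the statements hiding in that markup, here is a rough Python sketch that walks the nested `div`s and pulls out subject, predicate, and object. It is a deliberate simplification and not a conforming RDFa processor: it only looks at `vocab`, `rel`, and `resource` on the same element and assumes every tag is properly closed. The bracketed placeholders are left exactly as they appear in the example.

```python
from html.parser import HTMLParser

# The markup from the example, placeholders untouched.
HTML = """
<div class="vote2013-council-meta" resource="http://www.bbc.co.uk/news/politics/councils/[GSSID]">
  <div vocab="http://iptc.org/std/rNews/2011-10-07#" rel="about" resource="http://www.bbc.co.uk/things/[GUID]#id">
    <div vocab="http://www.w3.org/2002/07/owl#" rel="sameAs" resource="http://opendatacommunities.org/id/[COUNCIL-TYPE]/[COUNCIL-NAME]"></div>
    <div vocab="http://www.bbc.co.uk/ontologies/politics#" rel="governsGSS" resource="http://statistics.data.gov.uk/id/statistical-geography/[GSSID]"></div>
  </div>
</div>
"""


class RelCollector(HTMLParser):
    """Collect (subject, predicate, object) from nested rel/resource attributes."""

    def __init__(self):
        super().__init__()
        self.subjects = [None]  # stack: the subject in force at each nesting level
        self.triples = []

    def handle_starttag(self, tag, attrs):
        a = dict(attrs)
        subject = self.subjects[-1]
        resource = a.get("resource")
        if resource and "rel" in a and subject:
            # The predicate is the vocabulary address plus the term in rel.
            self.triples.append((subject, a.get("vocab", "") + a["rel"], resource))
        # Children of this element talk about its resource, if it has one.
        self.subjects.append(resource or subject)

    def handle_endtag(self, tag):
        self.subjects.pop()


collector = RelCollector()
collector.feed(HTML)
for triple in collector.triples:
    print(triple)
```

The three statements it prints line up with the annotations above: the page is about a thing, that thing is the same as the Open Data Communities council, and it governs a data.gov.uk region.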

Page 59: Web-scale IA using Linked Open Data

2. Publish an ontology

• Ontologies describe your content model in detail, defining the vocabulary for things, types of thing, and types of relationship:

• Classes: ‘person’, ‘book’, ‘wine’

• Properties: ‘age’, ‘ISBN’, ‘hasDistillery’

• Individuals: ‘Charles Dickens’, ‘Great Expectations’, ‘Laphroaig’
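
A toy version of that split, sketched in Python with the examples from the list. Which property belongs to which class is an assumption made purely for illustration, and real ontologies are published in RDF rather than as dictionaries.

```python
# A toy ontology using the slide's examples. The property-to-class
# assignments are assumptions for illustration only.
ontology = {
    "classes": {"person", "book", "wine"},
    "properties": {"age": "person", "ISBN": "book", "hasDistillery": "wine"},
}

# Individuals are typed with a class; the second one deliberately misuses a property.
individuals = {
    "Charles Dickens": {"type": "person"},
    "Great Expectations": {"type": "book", "hasDistillery": "Laphroaig"},
}


def problems(name):
    """List ways an individual breaks the ontology (an empty list means it conforms)."""
    data = individuals[name]
    issues = []
    if data["type"] not in ontology["classes"]:
        issues.append(f"unknown class: {data['type']}")
    for prop in data:
        if prop != "type" and ontology["properties"].get(prop) != data["type"]:
            issues.append(f"property not defined for {data['type']}: {prop}")
    return issues


for name in individuals:
    print(name, problems(name))
```

The ontology is what lets a machine notice that a book with a distillery makes no sense.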

Page 60: Web-scale IA using Linked Open Data

Many ontologies are published and available for reuse, or to build upon.

Ontologies are guidebooks to help us explore and understand subjects.

Schema.org: General purpose vocabulary

FOAF: Person-to-person relationships

rNews: News story publishing

Page 61: Web-scale IA using Linked Open Data

3. Make your CMS work harder

• Content management systems mostly suck for this stuff, but some - like Drupal and Umbraco - have growing support for publishing RDF (and other semantic formats).

• These systems even have some SPARQL support allowing linked data to be added to your own page views.

• Most showcase linked data projects aren’t using an off-the-shelf CMS, but things are improving with more semantically-friendly CMSs like Webnodes and Ximdex.

Page 62: Web-scale IA using Linked Open Data

Human version: http://www.bbc.co.uk/nature/life/Giant_Panda

Robot version: http://www.bbc.co.uk/nature/life/Giant_Panda.rdf

<rdf:Description rdf:about="/nature/species/Giant_Panda">
  <foaf:primaryTopic rdf:resource="/nature/species/Giant_Panda#species"/>
  <rdfs:seeAlso rdf:resource="/nature/species"/>
</rdf:Description>

<wo:Species rdf:about="/nature/life/Giant_Panda#species">
  <rdfs:label>Giant panda</rdfs:label>
  <wo:name rdf:resource="http://www.bbc.co.uk/nature/species/Giant_Panda#name"/>
  <foaf:depiction rdf:resource="http://ichef.bbci.co.uk/naturelibrary/images/ic/640x360/g/gi/giant_panda/giant_panda_1.jpg"/>

  <dc:description>The giant panda is a rare, endangered and elusive <a href="http://www.bbc.co.uk/nature/life/Bear">bear</a>, making the videos below of a newborn baby giant panda and the remarkable courtship scene filmed in the wild unique. Giant pandas are famous for their love of bamboo, a diet so nutritionally poor that the pandas have to consume up to 20kg each day. The extra digit on the panda's hand helps them to tear the bamboo and their gut is covered with a thick layer of mucus to protect against splinters. Habitat loss is the greatest cause of the giant panda's decline, and today their range is restricted to six separate mountain ranges in western <a href="http://www.bbc.co.uk/nature/places/China">China</a>. <br/> <br/> <b>Did you know?</b><br/> A giant panda is born pink, hairless, blind and 1/900th the size of its mother. <br/></dc:description>
  <owl:sameAs rdf:resource="http://dbpedia.org/resource/Giant_Panda"/>
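
To show what a robot can do with the robot version, here is a Python sketch that reads a cut-down document in the same shape as the excerpt above. The namespace declarations are added as assumptions so the fragment parses on its own (the `wo` address in particular should be checked against the published Wildlife Ontology), and nothing is fetched from the BBC.

```python
import xml.etree.ElementTree as ET

# Cut-down document in the shape of the excerpt. The xmlns declarations
# are assumptions added so that this fragment is well-formed by itself.
RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns:wo="http://purl.org/ontology/wo/">
  <wo:Species rdf:about="/nature/life/Giant_Panda#species">
    <rdfs:label>Giant panda</rdfs:label>
    <owl:sameAs rdf:resource="http://dbpedia.org/resource/Giant_Panda"/>
  </wo:Species>
</rdf:RDF>"""

NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "rdfs": "http://www.w3.org/2000/01/rdf-schema#",
    "owl": "http://www.w3.org/2002/07/owl#",
}

root = ET.fromstring(RDF_XML)
label = root.find(".//rdfs:label", NS).text
# Attributes are looked up by their fully qualified name: {namespace}local.
same_as = [el.get(f"{{{NS['rdf']}}}resource") for el in root.iterfind(".//owl:sameAs", NS)]
print(label, same_as)
```

The `owl:sameAs` link is the useful part: it tells a program that the BBC's panda and DBpedia's panda are the same animal, so data from both can be merged.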

Page 63: Web-scale IA using Linked Open Data

4. Be a pirate!

• Use your model to audit the content you have ready to go.

• Find the gaps - the concepts which are important to the subject domain, but which you don’t have content for.

• Sail the high seas in search of third-party content or data.

• Enrich your content with third-party data, then pay it forward by publishing linked data back out to the web.
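
The audit-and-fill steps above can be sketched with a few sets in Python. The concept names are borrowed from the theme parks model earlier in the deck; what we "own" and which third party covers what are invented for the example.

```python
# Hypothetical inventory: concepts the content model calls for versus
# concepts we already hold content about.
model_concepts = {"Park", "Land", "Attraction", "Character", "Restaurant"}
own_content = {"Park", "Attraction"}

# Assumed third-party coverage, for illustration only.
third_party = {"Character": "DBpedia", "Land": "DBpedia"}

# Find the gaps, then see which of them someone else can fill.
gaps = model_concepts - own_content
fillable = {concept: third_party[concept] for concept in sorted(gaps) if concept in third_party}
still_missing = sorted(gaps - set(fillable))
print(fillable, still_missing)
```

Whatever is left in `still_missing` is the content you would need to write yourself or keep hunting for.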

Page 64: Web-scale IA using Linked Open Data

Islands of treasure await the brave adventurer.

Page 65: Web-scale IA using Linked Open Data

5. IAs should code

<CONTROVERSY KLAXON!>

• Unexpected ideas can come from throwing different sources of data together.

• A little coding knowledge goes a long way toward building rough prototypes which prove concepts. Right now, more designer-friendly tools don’t exist.

• Python and Rails are popular choices among IAs who don’t mind a little hacking.

• If UX designers are encouraged to use native web tools, shouldn’t we also?

Page 66: Web-scale IA using Linked Open Data

DBpedia: Animal descriptions

Freebase: Species taxonomy

Geonames: Location data

BBC Wildlife finder: Video clips

Flickr API: Tagged photos

‘Wildlife Near You’ was an experiment in bootstrapping an entire content-rich product from no original content whatsoever.

Page 67: Web-scale IA using Linked Open Data

All the world’s a stage. What stories will we tell?

Page 68: Web-scale IA using Linked Open Data

Consider how your offering benefits the web as a whole.

Stitch your content into the fabric of the web.

Page 69: Web-scale IA using Linked Open Data

THE PATH AHEAD

Where next for the web, and for information architecture?

Page 70: Web-scale IA using Linked Open Data

It’s been an amazing ride, but the best is yet to come.

Page 71: Web-scale IA using Linked Open Data

The web was designed to break down barriers.

Page 72: Web-scale IA using Linked Open Data

The web was designed to build bridges of understanding.

Page 73: Web-scale IA using Linked Open Data

https://www.flickr.com/photos/raveland

Time to let the robot army do the heavy lifting.

Page 74: Web-scale IA using Linked Open Data

Time to tool up and face the challenges that lie ahead.

Page 75: Web-scale IA using Linked Open Data

‘Designers should code’. But that’s not code…

Page 76: Web-scale IA using Linked Open Data

That’s code! With zombies.

Page 77: Web-scale IA using Linked Open Data

Today the tools are rough and ready, as once they were for HTML.

Page 78: Web-scale IA using Linked Open Data

The Linked Data and Information Architecture communities have much to discuss.

Page 79: Web-scale IA using Linked Open Data

Information Architecture must continue to evolve, learn from others, and expand its range of influence.

Page 80: Web-scale IA using Linked Open Data

Ready to play?

Page 81: Web-scale IA using Linked Open Data

Thanks for listening.

This presentation is now available at http://slideshare.net/reduxd

To find out more about getting started with Linked Data, visit EUCLID. http://euclid-project.eu/

Dedicated to the coalition of the willing:

Silver Oliver @silveroliver
Michael Smethurst @fantasticlife
Paul Rissen @r4isstatic
Tom Scott @derivadow
Leigh Dodds @ldodds
Chris Sizemore @onpause

and the London and Reading Linked Data Meetup groups

Interested in content modelling? http://www.slideshare.net/reduxd/beyond-the-polar-bear