Mapping the Guardian's tags to the web of data

Peter Martin & Martin Belam

Guardian News & Media, November 2010

Our content model relies on tags...

...which are not anywhere near as boring as you think

Tags: Keywords, Contributor, Series, Publication, Tone

Content: Article, Video, Audio, Gallery, Cartoon

Keywords

Every piece of content carries a selection of hand-picked tags

They are added during content production

...and the system suggests them as you type

There is also a tag browser in the CMS

And a search so that you can 'Batch Tag'

There is an admin interface to manage tags...

...and generate reports on what has been created

On the site they give us related links & tag pages

(OK, that is admittedly a little bit boring)

They allow us to cross-promote content

A film review for "The Damned United" is in amongst the football stories

And we can create 'combiner' pages with them...

...many of which are more useful than bullfighting+vuvuzelas

This page is assembled automatically by combining the 'review' tone with the 'books' section
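A combiner page can be thought of as a set intersection: keep only the content that carries every one of the requested tags. The sketch below is a minimal illustration, not the Guardian's implementation. The `ContentItem` class, the `combiner_page` function, and the tag identifiers are all hypothetical, and it treats the 'books' section as just another tag.

```python
from dataclasses import dataclass, field


@dataclass
class ContentItem:
    """A piece of content and the hand-picked tags attached to it."""
    title: str
    tags: set[str] = field(default_factory=set)


def combiner_page(items, *required_tags):
    """Return the items that carry every one of the required tags."""
    wanted = set(required_tags)
    # `wanted <= item.tags` is a subset test: all wanted tags must be present.
    return [item for item in items if wanted <= item.tags]


# Hypothetical sample data; the tag identifiers are made up for illustration.
items = [
    ContentItem("Review of a new novel", {"tone/reviews", "books/books"}),
    ContentItem("Interview with an author", {"tone/interviews", "books/books"}),
    ContentItem("Review of a new film", {"tone/reviews", "film/film"}),
]

book_reviews = combiner_page(items, "tone/reviews", "books/books")
```

Passing more tags narrows the page further, which is why some combinations end up with very little content in them.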

Tags are used to place editorial components

Stories tagged with 'Apple' in the Technology section display recent tweets on the topic by Guardian contributors

And to customise commercial components

Adverts that appear in the Guardian Jobs slot are tuned by the tags applied to article content

Topical navigation on the iPhone

The Guardian iPhone app uses tags to provide lateral navigation into topics

Trending on the iPhone

The iPhone app also examines the tags with the most activity, to produce the 'trending' topic index
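The slides do not say how 'activity' is measured. One plausible reading, assumed here purely for illustration, is to count how many recently published items carry each tag and keep the most frequent. The function name `trending_tags` and the sample tag identifiers are hypothetical.

```python
from collections import Counter


def trending_tags(recent_items, top_n=3):
    """Count how many recent items carry each tag, most frequent first."""
    counts = Counter()
    for tags in recent_items:
        counts.update(set(tags))  # count a tag at most once per item
    return counts.most_common(top_n)


# Hypothetical tag lists for four recently published items.
recent_items = [
    ["technology/apple", "technology/iphone"],
    ["technology/apple", "technology/ipad"],
    ["football/worldcup2010"],
    ["technology/apple", "football/worldcup2010"],
]

top = trending_tags(recent_items, top_n=2)
```

A real index would presumably also weight by recency or reader traffic; this sketch only counts occurrences.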

Tags help with search results

We use links to tag pages as results for synonyms and near-synonyms commonly used by readers

Tags can go in folders

...and we can turn those folders into A-Z lists and navigation on the website
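Turning a folder of tags into an A-Z list amounts to grouping the tag names by initial letter. The sketch below assumes a folder is simply a list of display names. The `a_to_z` function and the sample names are hypothetical.

```python
from collections import defaultdict


def a_to_z(tag_names):
    """Group tag display names under their upper-cased first letter."""
    index = defaultdict(list)
    # Sort case-insensitively so each letter's entries come out in order.
    for name in sorted(tag_names, key=str.casefold):
        if name:  # skip empty names rather than fail on name[0]
            index[name[0].upper()].append(name)
    return dict(sorted(index.items()))


# Hypothetical contents of a 'football clubs' folder.
folder = ["Chelsea", "Arsenal", "Aston Villa", "Everton"]
listing = a_to_z(folder)
```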

And our tags are on Twitter

To our knowledge, they are the only bit of our information architecture to have an official presence on Twitter

Now our tags are entering the world of linked data

Our book reviews carry ISBNs

And our content API can be queried by ISBN

http://explorer.content.guardianapis.com/

Our artist tag pages have MusicBrainz IDs associated with them

And the API can be queried by MusicBrainz ID
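Both lookups follow the same pattern: filter content by an external identifier. The sketch below only builds the query URL and makes no network call. The endpoint, the parameter names (`reference`, `show-references`, `api-key`) and the `musicbrainz` reference type are assumptions to check against the Content API documentation, and the identifiers are placeholders rather than real ISBNs or MusicBrainz IDs.

```python
from urllib.parse import urlencode

# Assumed endpoint; the slides only give the API explorer address.
SEARCH_ENDPOINT = "https://content.guardianapis.com/search"


def reference_query_url(ref_type, ref_value, api_key):
    """Build a search URL filtered by an external reference such as an ISBN."""
    params = {
        "reference": f"{ref_type}/{ref_value}",  # assumed parameter name
        "show-references": ref_type,             # assumed parameter name
        "api-key": api_key,
    }
    # Keep the slash between reference type and value unescaped.
    return f"{SEARCH_ENDPOINT}?{urlencode(params, safe='/')}"


# Placeholder identifiers, not real records.
isbn_url = reference_query_url("isbn", "9780000000002", "YOUR-KEY")
artist_url = reference_query_url(
    "musicbrainz", "00000000-0000-0000-0000-000000000000", "YOUR-KEY"
)
```

The resulting URL can be pasted into the API explorer or fetched with any HTTP client once a real key and identifier are substituted.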

Why XML and JSON?

And not something a little richer and more semantic?

Where do we go next?

Where can we get the most linked data for the least effort? What will be used in the real world?

@currybet @guardian_tags