BBC Music : going native on the web

Tom Scott and Matthew Shorter

BBC W1

a bit of background

http://www.flickr.com/photos/jeffsmallwood/299208539/

muze

not connected with the rest of bbc.co.uk

not connected with the rest of the web

google hates us

bbc.co.uk is incoherent because it’s unconnected.

Much of what we produce and broadcast is difficult to find via Google (other search indexes are also available).



our strategy


build music credibility online for the BBC

help people find new programmes they will love

help people find new music

become the de facto place on the web for artist information

be part of the web



being part of the web


persistent urls for every resource

semantically linked and accessible to man and machine

lots of links to others

permissive license

in other words : “linked open data”

TimBL described four simple rules to do the web right:

1. Use URIs to identify things on the web as resources
2. Use HTTP so people can dereference them
3. Provide information about the resource when it is dereferenced
4. Include onward links

What this gives you is a highly interlinked web of resources - where each resource is linked to other resources that are contextually relevant.
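As a rough illustration of those four rules, here is a minimal sketch of a tiny in-memory "web". Everything in it is an assumption made for the example: the `Resource` class, the `WEB` dictionary and the example.org URIs are hypothetical, and a dictionary lookup stands in for a real HTTP request.

```python
from dataclasses import dataclass, field


@dataclass
class Resource:
    uri: str                # rule 1: a URI identifies the thing
    description: str        # rule 3: information returned when dereferenced
    links: list = field(default_factory=list)  # rule 4: onward links


# Hypothetical resources; on the real web these would be served over HTTP (rule 2).
WEB = {
    r.uri: r
    for r in [
        Resource(
            "http://example.org/artists/a1",
            "An artist",
            ["http://example.org/programmes/p1"],
        ),
        Resource(
            "http://example.org/programmes/p1",
            "A programme that features the artist",
            ["http://example.org/artists/a1"],
        ),
    ]
}


def dereference(uri):
    """Stand-in for an HTTP GET: return the resource behind a URI."""
    return WEB[uri]


def reachable(start):
    """Follow onward links from one URI and collect every URI we can reach."""
    seen = set()
    stack = [start]
    while stack:
        uri = stack.pop()
        if uri in seen or uri not in WEB:
            continue
        seen.add(uri)
        stack.extend(dereference(uri).links)
    return seen
```

Starting from the artist URI, `reachable` finds the programme as well, which is the "interlinked web of resources" idea in miniature.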

The idea of open linked data adds a new requirement - that of permissive licensing - so others can reuse data in new contexts.

But why?

Because it benefits us now and in the long term. Publishing a web page or any other piece of content online is useful, but if it is part of a network then its value is greatly increased. This is the network effect.

One consequence of the network effect is that the addition of a node by one individual indirectly benefits others who are part of the network — for example by purchasing a telephone a person makes other telephones more useful.

By building the web in this fashion our new artist pages, although useful in their own right, become much more useful when they are joined to programmes - directly linking to those programmes that feature that artist. The same goes for events. And of course the network effect goes both ways; it goes all ways. Linking artists to programmes also makes the programme pages more valuable - because there is now more context, more discovery and serendipity.

And that’s just within the BBC. By joining our data with the rest of the web the network effect is magnified yet further. And that has benefit to the BBC. But it also benefits the web at large. The BBC has a role that transcends its business needs - we can help create public value around our content for others, both individuals and businesses.



linked open data graph


This then is the LOD graph - a graph representing all the data sources that are semantically linked and accessible to man and machine AND published under a permissive license.

And this month the BBC has added two nodes to this graph - BBC programmes and BBC playcount data (and artist pages).



implications


different scale to anything the BBC has done before

need to automate as much as possible

need to link to existing BBC content

need to let others use this data

As we’ll see there are lots and lots of pages - 100,000s of them and they all need to be contextually linked up.

We needed to automate as much as possible - integrating with broadcast systems and data elsewhere on the web.

We also released our data via APIs under the backstage, non-commercial license.



approach


using the web as a cms with reactive moderation

musicbrainz to provide core metadata + web scale identifiers

work with others to encourage the adoption of musicbrainz

wikipedia to provide basic biographical information

integrating with broadcast systems and PIPs

integration with news.bbc.co.uk

So somewhat ironically - in a world where we are trying to reduce the number of Content Management Systems we have in effect ended up using a new one - the web itself.

MusicBrainz provides metadata about artists, releases and labels and, possibly more interestingly, web scale identifiers. These IDs, the codes at the end of the URLs, are unique to each artist and are being used by us, by MusicBrainz obviously, and by Last.fm.
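To show why an ID at the end of the URL is handy, here is a minimal sketch that pulls the identifier out of a URL so that pages on different sites can be joined on it. MusicBrainz IDs are UUIDs; the function name `mbid_from_url`, the URL shapes in the usage note and the ID itself are made up for illustration.

```python
import re
import uuid

# A UUID at the very end of the URL path, with an optional trailing slash.
MBID_RE = re.compile(
    r"([0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12})/?$",
    re.IGNORECASE,
)


def mbid_from_url(url):
    """Return the identifier at the end of a URL, or None if there isn't one."""
    # Drop any query string or fragment before looking at the path.
    path = url.split("?", 1)[0].split("#", 1)[0]
    match = MBID_RE.search(path)
    if match is None:
        return None
    # uuid.UUID validates the value and normalises it to lowercase.
    return str(uuid.UUID(match.group(1)))
```

With a made-up ID, `mbid_from_url("http://musicbrainz.org/artist/00000000-0000-4000-8000-000000000001")` and `mbid_from_url("http://www.bbc.co.uk/music/artists/00000000-0000-4000-8000-000000000001")` return the same string, so the two pages can be matched without comparing artist names.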

The more people and the more sites that use these identifiers, the better for everyone, because it makes it easier to link everything up. So we are working to encourage their adoption elsewhere in the industry - for example by NME and the commercial radio networks.

Because MusicBrainz includes URLs for Wikipedia we can go and fetch the introductory biographical text for each artist. We then monitor Wikipedia (via the IRC channel) for updates to those pages.

We are therefore able to get near realtime updates from wikipedia and updates from musicbrainz within an hour.
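The monitoring step might look something like the sketch below: read change notifications as they arrive and keep only those for pages we care about. The line format (a page title in double square brackets) is a simplifying assumption, and `changed_title` and `pages_to_refresh` are hypothetical names, not our production code.

```python
import re

# Assumed notification format: the changed page's title appears as [[Title]].
TITLE_RE = re.compile(r"\[\[(.+?)\]\]")


def changed_title(line):
    """Extract the page title from one change notification, if present."""
    match = TITLE_RE.search(line)
    return match.group(1) if match else None


def pages_to_refresh(lines, watched):
    """Return watched titles that changed, in first-seen order, without repeats."""
    stale = []
    for line in lines:
        title = changed_title(line)
        if title in watched and title not in stale:
            stale.append(title)
    return stale
```

Each title returned would then trigger a re-fetch of the introductory text for that artist, which is what keeps the biographies close to real time.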

Internally… integrating with our broadcast systems and pushing this data into the programmes space

And finally, because news stories tend to include links to the official artist homepage when they cover a story about an artist, we can look for these and match them to URLs in MusicBrainz (which also includes the ‘official site’ URLs). This gives us a neat and automated mechanism to cross-reference BBC news stories from artist pages.
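The matching described here can be sketched as a lookup table from normalised URLs to artist IDs. This is only an illustration under stated assumptions: the normalisation rules (ignore scheme, "www." and trailing slashes) are a guess at what is sensible, and the function names and example data are hypothetical.

```python
from urllib.parse import urlsplit


def normalise(url):
    """Reduce a URL to host + path so trivial variations still match."""
    parts = urlsplit(url.strip())
    host = (parts.hostname or "").lower()
    if host.startswith("www."):
        host = host[4:]
    return host + parts.path.rstrip("/")


def build_index(artist_urls):
    """Map each normalised URL to its artist ID.

    artist_urls: dict of artist ID -> list of URLs known for that artist.
    """
    index = {}
    for artist_id, urls in artist_urls.items():
        for url in urls:
            index[normalise(url)] = artist_id
    return index


def artists_in_story(story_links, index):
    """Return the artist IDs whose known URLs appear among a story's links."""
    found = []
    for link in story_links:
        artist_id = index.get(normalise(link))
        if artist_id is not None and artist_id not in found:
            found.append(artist_id)
    return found
```

For example, with `index = build_index({"artist-1": ["http://www.example.com/"]})`, a story linking to `http://example.com` is matched to `artist-1`, while links to unknown sites are simply ignored.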



results


380,000 persistent artist pages

integration with /programmes

creative commons licensed album reviews

web, mobile and machine views



coming soon

A sneak peek at the artist pages in the not too distant future...

coming soon

News stories and blogs contain links to the official artist pages, MySpace pages etc. - in other words, links that we know are about the artist because they are also in MusicBrainz.

So we can monitor BBC blogs and news stories for these URLs, and if we find one then we know that the story is about that artist and can add the link to the artist page.

Album reviews - brought in line with the new visual design and delivered via the new tech stack.

Creative Commons license.

open data

Releasing our reviews under a Creative Commons license means others can use them (as long as they abide by the terms of the license), like Channel 4 are doing with Abbey Road.

I think this is brilliant - but it did raise a few eyebrows in certain quarters!

future

link to programmes and events

releases, tracks and works

off-schedule content

recommendation and personalisation

a new music site

third party content…

So what might the future look like?

There are a bunch of things that we want to do with BBC content and new functionality, plus working with MusicBrainz to extend the existing schema.

But we are also interested in how we might work with third party services. For example...

future maybe?

What about photos from Flickr, or elsewhere?

Or...

future maybe?

Data from Last.FM - we could include similar artists or Top 10 tracks (attention data).

future maybe?

Or news from blogs - via Technorati

sources?

biographies - wikipedia

photos - flickr, smugmug, photobucket etc.

music metadata [including links] - musicbrainz

blogs - technorati, google

attention data - last.fm

audio and video - ???

But there are a number of sources for this data - which should we use? Which content areas should we consider? What criteria could we use to make a selection?

possible selection criteria

technical feasibility

unique content

best of breed

license to use & downstream rights for content

terms of use for api

existing moderation/ filters

non-commercial

Selection criteria might include... what else? are these right?
