Presentation at the industry track of the international semantic web conference (ISWC), 2012, Boston
Linked Data at The Open University: From Technical Challenges to Organizational Innovation
Mathieu d’Aquin (@mdaquin)
Knowledge Media Institute
Stuart Brown (@stuartbrown)
Communication Services
The Open University
What are we doing in the industry track?
Knowledge Media Institute: leading research center on semantic web technologies:
– Ontology engineering, ontology discovery
– Knowledge representation, reasoning, problem solving
– Interoperability, services, ontology matching, data linking
– 80 researchers/research assistants/PhD Students/ Academic-related staff
– 100s of publications
http://kmi.open.ac.uk
And So?
KMi is a department of the Open University
The Open University:
– The largest university in the UK: 250K students per year, 8000 associate lecturers, a big campus in Milton Keynes
– Created in 1969
– Almost entirely open and distance learning
– 13 regional centers, more national centers, courses available in a large number of countries
Big organization = crazy information infrastructure
data.open.ac.uk
The first linked data platform providing open information from across a whole university
Lots of (types of) data
Course information: 580 modules/ description of the course, information about the levels and number of credits associated with it, topics, and conditions of enrolment.
Research publications: 16,000 academic articles / information about authors, dates, abstract and venue of the publication.
Podcasts: 2220 video podcasts and 1500 audio podcasts / short description, topics, link to a representative image and to a transcript if available, information about the course the podcast might relate to and license information regarding the content of the podcast.
Open Educational Resources: 640 OpenLearn Units / short description, topics, tags used to annotate the resource, its language, the course it might relate to, and the license that applies to the content.
Youtube videos: 900 videos / short description of the video, tags that were used to annotate the video, collection it might be part of and link to the related course if relevant.
University buildings: 100 buildings / address, a picture of the building and the sub-divisions of the building into floors and spaces.
Library catalogue: 12,000 books/ topics, authors, publisher and ISBN, as well as the course related.
Others…
[Figure: architecture of the data.open.ac.uk platform. Stages: Collect, Extract, Link, Store, Expose, with Planning + Logging across them. Components: Scheduler, Ontologies, RSS Updater, XML Updater, RSS Extractor, RDF Extractor, RDF Cleaner (cleaning rules), URI creation rules (entity name, system), URL redirection rules, Triple Store (Delete (1), Add (2)), SPARQL endpoint, Web Server. An RSS feed yields new items and obsolete items, which become an RDF file (add) and an RDF file (delete). Legend: generic process, dataset-specific process. Dataset labels: Lib, courses, loc; ORO, podcast.]
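The update flow in the diagram (a feed split into new and obsolete items, turned into "add" and "delete" RDF files, with deletions applied before additions) could be sketched roughly as follows. This is only an illustration: the `store` dictionary, the `extract` function and the `http://example.org/` URIs are hypothetical stand-ins, not the platform's actual components.

```python
def plan_update(previous_ids, feed_ids):
    """Split feed items into new ones (to add) and obsolete ones (to delete)."""
    previous, current = set(previous_ids), set(feed_ids)
    return sorted(current - previous), sorted(previous - current)


def apply_update(store, to_delete, to_add, extract):
    """Apply deletions first, then additions, as in 'Delete (1), Add (2)'."""
    for item_id in to_delete:
        store.pop(item_id, None)
    for item_id in to_add:
        store[item_id] = extract(item_id)
    return store


def extract(item_id):
    # Hypothetical extractor: a real one would turn a feed entry into RDF triples.
    return {("http://example.org/item/" + item_id, "rdf:type", "ex:Item")}


# Toy "triple store": item id -> set of triples describing that item.
store = {"a": extract("a"), "b": extract("b")}
new_items, obsolete_items = plan_update(store.keys(), ["b", "c"])
apply_update(store, obsolete_items, new_items, extract)
print(sorted(store))  # ['b', 'c']
```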
First Issue: Convincing People
Not the technical bit… that’s easy
data.open.ac.uk is now a core infrastructure element of the Open University. But it took a lot of talking…
Initial Meeting with Data Owner
- Identify data
- Get sample data
- Identify copyright issues
- Identify possible links
- Identify users and usage
Data Modeling sessions
LD Team
Data Owner
LD Team
LD Team
- Find reusable ontologies
- Map onto the data
- Identify uncovered parts
- Define URI scheme
Data Modeling Validation
LD Team
Data Owner
Development of Extractor
URI Creation Rules Definition
Deployment
LD Team
Where most of the work is done
What works
Not changing the way people work:
– Pull data from feeds exported by the original systems. Not changing them, or introducing any additional difficulty.
– Data taken “as is”, with an effort on understanding its original modeling.
Bring the user along, add value:
– Data re-modeling as linked data creates positive side effects on the original data
– Talk about possible links and usages first
– Improve the usability of the data produced by the institution = make the work of people more visible and useful
So, what is it that we can do?
Oh no! We can’t find a killer app!
“Small things” that either were impossible before, or are now trivial to do
Ben, in the corridor:
Hey! Your linked data thing, can it tell me what are the podcasts attached to courses that we no longer offer?
Time difference in answering this question: x weeks (not really feasible) → 5 minutes (easy)
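To illustrate why Ben's question becomes easy once course and podcast data sit in the same graph, here is a toy version of the join in plain Python. On the real platform this would be a SPARQL query against the endpoint; the slides do not show the vocabulary, so every identifier and predicate below (`ex:isOffered`, `ex:relatesToCourse`, the `course:` and `podcast:` names) is a made-up placeholder.

```python
# Toy triples; every identifier and predicate here is a made-up placeholder.
TRIPLES = {
    ("course:A100", "ex:isOffered", "true"),
    ("course:B200", "ex:isOffered", "false"),
    ("podcast:p1", "ex:relatesToCourse", "course:A100"),
    ("podcast:p2", "ex:relatesToCourse", "course:B200"),
    ("podcast:p3", "ex:relatesToCourse", "course:B200"),
}


def podcasts_for_discontinued_courses(triples):
    """Return podcasts linked to courses flagged as no longer offered."""
    # Step 1: courses coming from the (hypothetical) course dataset.
    discontinued = {
        s for s, p, o in triples if p == "ex:isOffered" and o == "false"
    }
    # Step 2: join with links coming from the (hypothetical) podcast dataset.
    return sorted(
        s for s, p, o in triples
        if p == "ex:relatesToCourse" and o in discontinued
    )


result = podcasts_for_discontinued_courses(TRIPLES)
print(result)  # ['podcast:p2', 'podcast:p3']
```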
Simple works… A lot of “simple” works!
Resource Discovery
Research Exploration
Social
Example: map of buildings
Interactive map of Open University Buildings in the UK
Built in 1 hour
Connected to Ordnance Survey for location based on post-codes
Allowed us to find out about issues in the data.
Example: Connecting our resources
Show the courses and podcasts that connect to an open educational resource.
Trivial with data.open.ac.uk
Impossible to do before (!)
Simple things as examples: Inspire
The simple apps above are not demonstrating particularly impressive technical achievements: They are here to show what can be done (easily)
Study at the OU mobile application (Communication and student services)
Supporting Research Evaluation (Research School)
Discovery of open educational resources (Open Media Unit and KMi)
So the technology is mature enough after all?
No!
Providing an open SPARQL endpoint is a very bad idea: 1 query can kill everything (and it does… often)
Our approach: leave things open, fix when it breaks
Example: Mirror triple-stores updated in parallel
Example: Simple cache based on serving static files for the most popular URIs/queries. The cache is updated with the data.
Keep the standard/open/application independent interface: Free and easy reuse helps innovation, an API is an obstacle.
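The static-file cache mentioned above might look something like the sketch below. It is a simplification under stated assumptions: it caches every query rather than only the most popular ones, and it clears the cache on a data update instead of refreshing entries. `StaticFileCache` and `fake_store` are hypothetical names, not part of the actual platform.

```python
import hashlib
import tempfile
from pathlib import Path


class StaticFileCache:
    """Serve cached query results from files; clear them when the data changes."""

    def __init__(self, directory, run_query):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)
        self.run_query = run_query  # callable that actually hits the triple store

    def _path(self, query):
        # One file per distinct query text, named by its hash.
        digest = hashlib.sha256(query.encode("utf-8")).hexdigest()
        return self.directory / (digest + ".txt")

    def get(self, query):
        path = self._path(query)
        if path.exists():
            return path.read_text(encoding="utf-8")
        result = self.run_query(query)
        path.write_text(result, encoding="utf-8")
        return result

    def invalidate(self):
        """Call after each data update so stale answers are not served."""
        for path in self.directory.glob("*.txt"):
            path.unlink()


calls = []


def fake_store(query):
    # Stand-in for the triple store; records how often it is hit.
    calls.append(query)
    return "result for " + query


QUERY = "SELECT * WHERE { ?s ?p ?o } LIMIT 1"
cache = StaticFileCache(tempfile.mkdtemp(), fake_store)
first = cache.get(QUERY)   # miss: goes to the store
second = cache.get(QUERY)  # hit: served from the file
cache.invalidate()         # data was updated
third = cache.get(QUERY)   # miss again
print(len(calls))  # 2
```

The point of the design is that a repeated expensive query costs one file read instead of one more hit on the open endpoint.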
Conclusion
Starting point: Showing off our technology, information integration issues, access to open information
Where we got: (Open) innovation, competitive advantage, Linked Data as part of the backbone of the University’s information infrastructure, new systems built doing linked data by design
BTW, does anybody have a better word than backbone for a linked data based information infrastructure?
Going further…
We are not the only University (anymore): data.southampton.ac.uk, data.ox.ac.uk, data.aalto.fi, data.uni-muenster.de…
Ultimately, the University does not count: moving to “education à la carte”
The Open University
University of Bristol
University of Southampton
mEducator
University of Muenster, DE
OrganicEduNet
Data.gov.uk education
Orgs., Buildings, Locations
Research outputs
Learning resources
What we need: Community effort
LinkedUniversities.org
What we need: A reusable toolset
Marimba4lib.com
What we need: Compelling, global use cases
LinkedUp-Project.eu
Thanks!
More info:
http://data.open.ac.uk
http://lucero-project.org
http://linkeduniversities.org
http://linkedup-project.eu
http://mdaquin.net [email protected]
What it provides
• Linked Data: URIs resolve with redirects to RDF and HTML, content negotiation
• CC-BY license on everything
• SPARQL endpoint (SPARQL 1.1)
• That’s all…
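As an illustration of "URIs resolve with redirects to RDF and HTML, content negotiation", the sketch below picks a redirect target from an HTTP `Accept` header. It assumes the common linked data pattern of a 303 redirect to a document URI; the slides do not state the status code, and the `.rdf`/`.html` suffixes and the `http://example.org/` URI are hypothetical.

```python
def parse_accept(header):
    """Return acceptable media types from an Accept header, best first."""
    entries = []
    for position, part in enumerate(header.split(",")):
        fields = [field.strip() for field in part.split(";")]
        media_type = fields[0].lower()
        if not media_type:
            continue
        quality = 1.0
        for field in fields[1:]:
            if field.startswith("q="):
                try:
                    quality = float(field[2:])
                except ValueError:
                    quality = 0.0
        if quality > 0:  # q=0 means "not acceptable"
            entries.append((-quality, position, media_type))
    return [media_type for _, _, media_type in sorted(entries)]


# Hypothetical mapping from media type to document suffix.
REPRESENTATIONS = {
    "application/rdf+xml": ".rdf",
    "text/html": ".html",
}


def redirect_for(resource_uri, accept_header, default="text/html"):
    """Return (status, location) redirecting to the best matching document."""
    for media_type in parse_accept(accept_header):
        if media_type in REPRESENTATIONS:
            return 303, resource_uri + REPRESENTATIONS[media_type]
    return 303, resource_uri + REPRESENTATIONS[default]


uri = "http://example.org/id/course/x"  # placeholder resource URI
print(redirect_for(uri, "application/rdf+xml"))
print(redirect_for(uri, "text/html,application/rdf+xml;q=0.5"))
```

A browser sending `text/html` ends up on the HTML page, while an RDF client asking for `application/rdf+xml` gets the data for the same URI.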
Integration
• Big organization = crazy information infrastructure
• Special focus on open/public information:
– Course information
– Open educational resources (OpenLearn)
– Multimedia material: Podcast repository, iTunes U, openly licensed Youtube videos
– Open access repository of research publications
Example: Charting our offering
Showing basic charts generated from the answers to SPARQL queries
The only effort required is coming up with the query
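A rough sketch of how a chart can fall out of a SPARQL answer: the function below reads results in the standard SPARQL 1.1 Query Results JSON format and renders a text bar chart. The sample data, the variable names `topic` and `count`, and the text rendering are all invented for illustration; the actual charts were presumably graphical.

```python
import json

# Invented sample in the SPARQL 1.1 Query Results JSON format.
SAMPLE = json.loads("""
{
  "head": {"vars": ["topic", "count"]},
  "results": {"bindings": [
    {"topic": {"type": "literal", "value": "History"},
     "count": {"type": "literal", "value": "4"}},
    {"topic": {"type": "literal", "value": "Physics"},
     "count": {"type": "literal", "value": "2"}}
  ]}
}
""")


def to_series(results, label_var, value_var):
    """Turn SPARQL JSON bindings into a list of (label, number) pairs."""
    rows = results["results"]["bindings"]
    return [
        (row[label_var]["value"], int(row[value_var]["value"])) for row in rows
    ]


def text_bar_chart(series):
    """Render one line per label, with one '#' per unit of value."""
    width = max(len(label) for label, _ in series)
    return "\n".join(
        label.ljust(width) + " | " + "#" * value for label, value in series
    )


series = to_series(SAMPLE, "topic", "count")
chart = text_bar_chart(series)
print(chart)
```

Once this plumbing exists, a new chart only needs a new query, which matches the claim that the query is the only real effort.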
Lean-back OU podcast channel on Google TV (IT services)
More on using the users…
Obviously: Won’t be convinced by technological blabla
Show examples of what it can do! (BTW we have a problem here…)
More importantly: ask them what they would like to do. We asked: communication services, library people, student services, marketing services, faculties… (they are very creative)
A typical email in my inbox (inspired by a true story):
Hi Mathieu,
Stuart told me that your linked data thing might help with the problem Laura had with Guy’s system. Can we have a chat?
Tx, Ben.
The OU’s presence in the media (Media relations)
Academics in “Arts and Humanities” most often involved with the media (in number of news items)
Topics most commonly mentioned by news outlets owned by the BBC (in number of news items)