
Videolectures for ocwc2010

Page 1: Videolectures for ocwc2010

VIDEOLECTURES.NET

exchange ideas / share knowledge

Page 2: Videolectures for ocwc2010

Outline of the talk

About videolectures.net and K4A

Technical solutions in preparation

Towards content personalisation

Automatic Transcriptions

Enhanced Recommender Services

Visitors analytics

OCWC on videolectures.net

Page 3: Videolectures for ocwc2010

Jozef Stefan Institute Department of Knowledge Technologies @ Center for Knowledge Transfer

Selection of FP6 & FP7 Projects (Integrated Projects and Networks of Excellence only):

FP7 IP ACTIVE – Enabling the Knowledge Powered Enterprise

FP7 IP COIN – COllaboration and INteroperability for networked enterprises

FP7 IP EURIDICE – Inter-Disciplinary Research on Intelligent Cargo for Efficient, Safe and Environment-friendly Logistics

FP7 NoE PASCAL2 – Pattern Analysis, Statistical Modeling and Computational Learning

FP7 NoE T4ME – Machine Translation & Multilingual Information Retrieval

FP6 IP NeOn – Lifecycle Support for Networked Ontologies

FP6 IP ECOLEAD – European Collaborative Networked Organizations Leadership Initiative

FP6 IP SEKT – Semantically-Enabled Knowledge Technologies

Jozef Stefan Institute (JSI) is the leading Slovene research institution for natural sciences (800+ people) in the areas of computer science, physics, chemistry

Department of Knowledge Technologies has around 60 people working in various areas of artificial intelligence (machine learning, data mining, semantic technologies, computational linguistics, decision support)

Spin-offs: Cyc-Europe, Quintelligence, LiveNetLife, Temida, XLab

Selection of Portals and Products:

Text-Garden (http://www.textmining.net)

Enrycher (http://enrycher.ijs.si/)

VideoLectures.NET (http://videolectures.net/)

IST-World (http://www.ist-world.org/)

Project Intelligence (http://pi.ijs.si/)

Search-Point (http://searchpoint.ijs.si/)

OntoGen (http://ontogen.ijs.si/)

Document-Atlas (http://docatlas.ijs.si/)

AnswerArt (http://answerart.net/)

Semantic-Graphs, Document-Atlas, VideoLectures.NET

Page 4: Videolectures for ocwc2010

Videolectures: Basic facts

10000 videolectures - CC

10000 unique visitors per day

Recorded events in 2009: 70 (2868 videos)

Shared business models:

Research projects

Events

Academic institutions

Baseline funds

In-house developed services with strong support in research in semantics

JSI infrastructure; staff: 5 permanent, 10-15 part-time

Goal: Contributing to global change in higher education by offering open access to high quality scientific material

Page 5: Videolectures for ocwc2010

International dimension

European research supported by the European Commission (from 3M to 10M Euro scale RTD projects)

International institutions: EC, CEEMAN, CERN, Cluster Network, EFMD, IPSA, CLSP, MIT, UC Irvine, Yale, Stanford, TEDx, CMU, University of Ljubljana, Slovenian public research agency…

Active participation in: Opencast, OCWC, EuroCRIS

Knowledge4All foundation

Page 6: Videolectures for ocwc2010

K4A

Originates from Pascal NoE

Knowledge and content exchange network

Inspired and led by the most active institutions and organisations around the world in the area of free and open scientific content

Effective and pragmatic

Global impact

Distributed, networked, bottom-up governance

Funds, joint projects

Using existing University networks and resources

Distinctive element: all content to be scientifically approved

Page 7: Videolectures for ocwc2010

K4A - Five pillars of activity

Infrastructure: ICT Matterhorn - Interoperability, Channels, Semantics

Science: Journal and conferences

Online scientific video journal to global university

Education: courses and content

Quality assurance – peer reviewed content

Research: facilitating the systems, accessing the content, enabling interaction

IPRs, multilinguality, standards

Business models (added value models)

Other continent connections: case study in engagement and interaction

Page 8: Videolectures for ocwc2010

World Summit Award 09

World Summit Award 09: “With this, ‘Videolectures.Net’ has approximately outrun 20.000 other products and projects from 157 countries participating in the 4th edition of the WSA, the United Nations based contest for e-content and creativity in the Information Society.”

Page 9: Videolectures for ocwc2010

Technology stack

5 servers serving 20 TB of data

700,000 unique files

300,000 web requests daily (90,000 dynamic)

Application: Django software / VideoLectures

Services: Nginx, Apache, PostgreSQL, Memcached

Flash Streaming server

Windows Media Server

Cloud storage, Static web hosting

System level: Ubuntu Linux Server, Windows Server, Linux

Servers: Web server, Database

Development

Storage, Processing

Flash video streaming

Windows streaming

Amazon S3
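The load figures on this slide can be turned into rough averages. The sketch below is only back-of-the-envelope arithmetic on the numbers quoted above; it assumes decimal terabytes and an even spread of requests over the day, neither of which the slide states.

```python
# Figures quoted on the slide.
requests_per_day = 300_000
dynamic_requests_per_day = 90_000
unique_files = 700_000
total_bytes = 20 * 10**12  # assumption: 20 TB taken as decimal terabytes

SECONDS_PER_DAY = 24 * 60 * 60

# Average request rate, assuming traffic is spread evenly over the day.
avg_requests_per_second = requests_per_day / SECONDS_PER_DAY

# Share of requests that need dynamic (application) handling.
dynamic_share = dynamic_requests_per_day / requests_per_day

# Mean size of a stored file in decimal megabytes.
avg_file_size_mb = total_bytes / unique_files / 10**6

print(f"{avg_requests_per_second:.2f} requests/s on average")
print(f"{dynamic_share:.0%} of requests are dynamic")
print(f"{avg_file_size_mb:.1f} MB per file on average")
```

Real traffic is bursty, so peak rates would be higher than the daily average computed here.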

Page 10: Videolectures for ocwc2010

Technologies and Research

Graph/Social Network Analysis (GraphGarden/SNAP, IST-World, FPIntelligence)

Complex Data Visualization (DocAtlas, NewsExplorer, SearchPoint)

Computational Linguistics (Enrycher, AnswerArt)

Social Computing/Web2.0 (LiveNetLife)

Decision Support (DEX)

Light-Weight Semantic Technologies (OntoGen, OntoBridge)

Deep Semantics & Reasoning (Cyc)

Statistical Machine Learning

Data/Web/Text/Stream-Mining (TextGarden Suite of tools)

Page 11: Videolectures for ocwc2010

Personalisation

Modeling

Log files

Content mining

(Needs and preferences)

Adaptation

Page 12: Videolectures for ocwc2010

Towards personalisation @ videolectures.net

Enrycher (Contextualisation of content objects)

Quintelligence Miner (user modeling and segmentation)

Recommender (Content/user matching)

Content/learning object

User behavior

TEL environment (videolectures.net)

Page 13: Videolectures for ocwc2010

User profiling service (QMiner)

Ver1 – identifying segments: developed for NYT, Bloomberg

Ver2 – individual profiling: web service for videolectures.net

Analysing user logs and the content being accessed

Textual description – need for transcripts

Contextualisation – need for enriched content

Deep analytics

Modeling user behavior

Detecting SIGs – marketing groups, investors,…

Predicting and simulating user’s

Detecting trends in visits

Personalising content and methods

Page 14: Videolectures for ocwc2010

User profiling – identifying segments

Log files

User profiles

Videos, articles

Advertisers

Segment Keywords

Machine learners

Text Mining, SVM, Link analysis, Learning, Modeling, Mining,…

Biologists - Arthropoda

Spiders, Mites, Ticks, Crab, Lobster, Shrimp….

… …

QMiner System/services

Editors

Authors

[Screenshot: QMiner interface with menu items Search fields, Search field values, Add state, Non-persistent Query, Get state, Get states, Update, Rename state, Delete state, Change Index, Exit]
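The idea of deriving user profiles from log files and matching them to keyword-defined segments can be illustrated with a toy example. This is a minimal sketch, not QMiner's actual method: the log records, video keywords, and the overlap-based assignment are all hypothetical, and only the two segment names and some of their keywords come from the slide.

```python
from collections import Counter, defaultdict

# Hypothetical log records: (user id, id of the video that was viewed).
LOG = [("u1", "v1"), ("u1", "v2"), ("u2", "v3"), ("u2", "v3")]

# Hypothetical keyword descriptions of the videos.
VIDEO_KEYWORDS = {
    "v1": ["svm", "text mining"],
    "v2": ["svm", "link analysis"],
    "v3": ["spiders", "mites"],
}

# Segments defined by keywords, following the examples on the slide.
SEGMENTS = {
    "Machine learners": {"text mining", "svm", "link analysis"},
    "Biologists - Arthropoda": {"spiders", "mites", "ticks"},
}


def build_profiles(log, video_keywords):
    """Count, per user, the keywords of every video the user accessed."""
    profiles = defaultdict(Counter)
    for user, video in log:
        profiles[user].update(video_keywords.get(video, []))
    return profiles


def assign_segment(profile, segments):
    """Pick the segment whose keywords overlap most with the profile."""
    def overlap(name):
        return sum(profile[keyword] for keyword in segments[name])

    best = max(segments, key=overlap)
    return best if overlap(best) > 0 else None


profiles = build_profiles(LOG, VIDEO_KEYWORDS)
for user in sorted(profiles):
    print(user, "->", assign_segment(profiles[user], SEGMENTS))
```

A production system would replace the simple keyword overlap with learned models (the slide mentions text mining and SVM), but the data flow from logs to profiles to segments is the same.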

Page 15: Videolectures for ocwc2010

Recommendation service (Recommender)

Ver1: Developed and tested for videolectures.net

Ver2: Operating at Bloomberg.com also for textual documents

Each video is scored from three directions:

Collaborative filtering

Category – VL taxonomy and improved SVM module working on optimized categories

Content – matching video against the user group’s history using all the enriched features

All three scores are combined into final score using weights estimated from the collected training data
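The combination step can be sketched as a weighted sum of the three scores. This is an illustration under assumptions: the slide does not say how the scores are combined beyond "using weights", so the linear form, the score ranges, the weight values, and all names below are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class VideoScores:
    video_id: str
    collaborative: float  # collaborative-filtering score, assumed in [0, 1]
    category: float       # category (taxonomy/SVM) score, assumed in [0, 1]
    content: float        # content-match score, assumed in [0, 1]


def combine(scores, weights):
    """Weighted sum of the three scores (assumed linear combination)."""
    w_collab, w_category, w_content = weights
    return (w_collab * scores.collaborative
            + w_category * scores.category
            + w_content * scores.content)


def rank(candidates, weights):
    """Order candidate videos by their combined score, best first."""
    return sorted(candidates, key=lambda s: combine(s, weights), reverse=True)


# Hypothetical weights; in the described service they are estimated
# from collected training data.
HYPOTHETICAL_WEIGHTS = (0.5, 0.2, 0.3)

candidates = [
    VideoScores("lecture-a", 0.8, 0.2, 0.1),
    VideoScores("lecture-b", 0.2, 0.8, 0.9),
    VideoScores("lecture-c", 0.5, 0.5, 0.5),
]

ranked = rank(candidates, HYPOTHETICAL_WEIGHTS)
print([s.video_id for s in ranked])
```

Keeping the three scorers separate and combining them late means each can be improved or re-weighted independently when new training data arrives.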

Demonstration

Page 16: Videolectures for ocwc2010

Content enrichment (Enrycher)

Providing wider context to the document… needed for efficient content mining and modeling

A set of Web services (http://enrycher.ijs.si)

Enriching a document with annotations presenting:

Extracted concepts known to the machine

Generated most descriptive sentences and dynamic abstracts

Semantic graph

Descriptions with existing ontologies

Links to the external sources (wikipedia, dmoz, dbpedia, openlink data)

Demonstration

Page 17: Videolectures for ocwc2010

Transcription service (Transcriptor)

Prototype service with automatic rapid vocabulary training of the speech recognition engine using:

Lecture description

Slides information

Videolectures taxonomy

Enriched complementary content

Used for:

Transcription

Speech indexing

Video content search
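One way to picture the vocabulary-training input is as a list of lecture-specific terms gathered from the sources named above. The sketch below only collects candidate terms that a general vocabulary lacks; it does not train a speech recogniser, and the tokenizer, the sample texts, and the base vocabulary are hypothetical.

```python
import re
from collections import Counter


def tokenize(text):
    """Lower-case the text and split it into simple word tokens."""
    return re.findall(r"[a-z][a-z'-]*", text.lower())


def build_vocabulary(sources, base_vocabulary, min_count=1):
    """Return sorted terms from the sources that the base vocabulary lacks."""
    counts = Counter()
    for text in sources.values():
        counts.update(tokenize(text))
    return sorted(
        word for word, count in counts.items()
        if count >= min_count and word not in base_vocabulary
    )


# Hypothetical inputs standing in for the lecture description, the slide
# text, and the taxonomy path of a lecture.
sources = {
    "description": "Kernel methods for text mining",
    "slides": "Support vector machines: kernel trick",
    "taxonomy": "Computer Science / Machine Learning / Kernel Methods",
}

# Hypothetical general-purpose vocabulary already known to the engine.
base_vocabulary = {
    "for", "text", "computer", "science",
    "machine", "learning", "support", "methods",
}

print(build_vocabulary(sources, base_vocabulary))
```

Raising `min_count` would keep only terms that recur across sources, which may filter out noise from slide text.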

Demonstration

Page 18: Videolectures for ocwc2010

OCWC on videolectures.NET

Videolectures.NET offers to organisations:

Low cost service and channel

Unlimited video preservation and fixed urls

Organisation, project and personal videography pages

Access to the back-office editorial tools

Many innovative viewing and content management features

Sustainable innovation through research projects

Demonstration

Page 19: Videolectures for ocwc2010

Supporting OCWC

Video and courses content distribution through videolectures.net

User modeling and analytics … on a distributed network of OCWC sites

… common access to the analytics services

Opening existing services for independent use… transcription, categorisation, classification, content enrichment

OCWC website on videolectures.net: … crawling, enriching, structuring, categorising distributed materials

… common curriculum support

Page 20: Videolectures for ocwc2010

[email protected] – head of Center for knowledge transfer at JSI

[email protected] – head of videolectures.net service

[email protected] – main editor at videolectures.net

[email protected] – head of the KT research group at JSI

John Shawe-Taylor ([email protected]) – K4A director

Colin de la Higuera ([email protected]) – K4A director

Enrycher: http://enrycher.ijs.si

Recommender: http://videolectures.net

Contextual search: http://searchpoint.ijs.si

Page 21: Videolectures for ocwc2010

Support slides

Page 22: Videolectures for ocwc2010

A movement/competition …

Page 23: Videolectures for ocwc2010

Competitive advantage

Access to lecture rooms and the three most active communities

Videos + slides + comments

Viewing features

Semantically enriched functionalities

Curriculum building and management support

Efficient back-office

Low cost and efficient service from recording to hosting

Page 24: Videolectures for ocwc2010

Answering the challenges?

OpenCourseWare

MIT + >140 Universities

Curriculum, standards, quality of training

OpenCast

Berkeley, ETH + 40 top World Universities

OS for video recording at Universities

VL as CDCs

Open CDN

Videolectures + JSI team

Using University Internet links and servers

Knowledge4All foundation

Page 25: Videolectures for ocwc2010

K4A founders

Europe – Pascal2 Network of Excellence:

University College London

Jozef Stefan Institute

University of Bristol

XEROX Research Centre Europe

ETH Zurich

CERN

US:

Berkeley + Opencast community

MIT + OCW consortium

Asia:

Korea University + Network of South Korean Universities

Africa:

Voices of Africa, Kenya + East Africa Universities

Kofi Annan Center for ICT and Development, Ghana + West Africa Universities

Page 26: Videolectures for ocwc2010

K4A - reach

Page 27: Videolectures for ocwc2010

Current development

OpenCDN – OSS/Collaborative Content Distribution Network

Automatic capturing, enriching, and synchronisation

Deep semantic search through videos

Accessibility, multilinguality

Knowledge extraction

Speech Indexing, Text Mining, Video mining,

Automatic ontology construction,

User Tracking and Profiling.

Page 28: Videolectures for ocwc2010

SCOPE proposal

Page 29: Videolectures for ocwc2010
Page 30: Videolectures for ocwc2010

Visitors

Page 31: Videolectures for ocwc2010

Knowledge 4 All

Page 32: Videolectures for ocwc2010

Expressed interest

Internet Society Central America - Mexico

Individual organisations: Trento, ULJ, Zagreb, Southampton, CNRS, VTT, Max Planck, TU Graz, TUB, Oxford, Carlos III de Madrid, UVA,…

Commercial organisations: Springer Verlag, Elsevier Science

Governmental bodies: Slovenia, European Commission

Page 33: Videolectures for ocwc2010

Development

Research

Operative

Free, open access, high quality, scientific content

Systems, standards, interoperability

Didactics, methodics, pedagogical models

Methods (individual, collaborative, business)

Added value (business) models

Emerging organisation models

Innovative tools

Page 34: Videolectures for ocwc2010

Projects

In preparation:

AI Research institute for West Africa: implications for infrastructure, summer schools, course definition, interaction software, etc.

Education kiosks in Africa

Journal SCI registration – also in discussion with Springer about possible publication

Virtual conference

Virtual university

Web 2.5 for learning: support for discussion groups, research communities

Page 35: Videolectures for ocwc2010

Long-term options

Innovation tube – industry/business use

Virtual universities and virtual programmes

Bottom-up, distributed, self-organised

Authoring services

Support content enrichment for the content creators

Services:

On-the-fly personalisation and recommendation

Video scene recognition, automatic annotation and categorisation

Semantic and multilingual search

Accessibility, Internationalization (subtitles, transcripts)

Advanced presentation services with direct user involvement

Textual, graphical, video (audio) content integration services and enrichment