21
A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis Dmitry Mouromtsev, Fedor Kozlov, Liubov Kovriguina and Olga Parkhimovich Laboratory ISST @ ITMO University, St. Petersburg, Russia Linked Learning meets LinkedUp: Learning and Education with the Web of Data @ ISWC 2014, Italy

A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

Embed Size (px)

DESCRIPTION

The paper describes a combined approach to maintaining an E-Learning ontology in dynamic and changing educational environment. The developed NLP algorithm based on morpho-syntactic patterns is applied for terminology extraction from course tasks that allows to interlink extracted terms with the instances of the system’s ontology whenever some educational materials are changed. These links are used to gather statistics, evaluate quality of lectures' and tasks’ materials, analyse students’ answers to the tasks and detect difficult terminology of the course in general (for the teachers) and its understandability in particular (for every student).

Citation preview

Page 1: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

A Combined Method for E-Learning

Ontology Population based on NLP and

User Activity Analysis

Dmitry Mouromtsev, Fedor Kozlov, Liubov Kovriguina and Olga Parkhimovich

Laboratory ISST @ ITMO University, St. Petersburg, Russia

Linked Learning meets LinkedUp: Learning and Education with the Web of Data @ ISWC 2014, Italy

Page 2: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

Introduction

● Use semantics to make education materials

reusable and flexible,

● Different datasets in one e-learning system,

● We need to provide tools for tutors to

improve their courses.

Page 3: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

Use students to improve your course!

Page 4: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

Goal

● To develop the ontologies,

● To convert tests from XML to RDF,

● To map tests with subject terms via NLP

algorithms,

● To gather user’s statistics and implement

analysis module.

Page 5: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

ECOLE: Front-end● Ontology-based e-learning

system,

● User friendly interface,

● Based on Django framework.

Page 6: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

ECOLE: Back-end● Collection of educational materials

from different open resources

(Dbpedia, BNB),

● Analytics tools,

● Based on the Information Workbench

platform.

Page 7: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

The ontology of education resources

● Extends AIISO

ontology for education

process and structure,

● Uses BIBO for

bibliographic

resources,

● Uses MA-ONT for

media resources.

Page 8: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

The ontology of tests

● Developed to describe the content of

tests,

● Extends top-level ontology of the

system,

● 12 classes,

● 10 object properties,

● 6 datatype properties,

● http://purl.org/ailab/testontology.

Page 9: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

The ontology of student activity

● Developed to store information

about the student's learning

process and results,

● Uses FOAF,

● Uses the ontology of test,● 10 classes,

● 15 object properties,

● 5 datatype properties,

● http://purl.org/ailab/learningresults

.

Page 10: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

Lemmatization

● Extracts lemmas from subject term labels,

● Uses NLP procedures and dictionaries to

generate lemmas,

● Stores lemmas in triples.

Page 11: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

Terms extraction

Page 12: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

New System Terms● If the extracted word

sequence doesn't match

any of the system terms,

it is included as a new

system term,

● New system terms are

validated via SPARQL

queries to Dbpedia.

Page 13: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

NLP Algorithm

Dictionary

Patterns

Set

Tokenization

Morphological

and Semantic

Tagging

Extracting

Candidate

Terms

Producing the

Canonical Form

of the Term +

Lemma(s)

● Uses Part-of-Speech patterns

● Grammatical agreement is specified in

the patterns

● Words+Lemmas+Morphology+Some

semantic tags

● Superlemmas (a common lemma is

generated for lexical variants)

● Derivation paradigms (a common

lemma is generated for words with the

same root)

Page 14: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

Implementation: Test parsers

● Converts tests of the course

from XML to RDF,

● Uses Information Workbench

XMLProvider to automatically

update,

● Describes mapping using

XPath functions.

Page 15: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

Implementation: NLP module

● Uses dictionaries in NooJ format:<LEMMA>+<PART OF SPEECH TAG>+

<INFLECTIONAL PARADIGM>+<OTHER ANNOTATIONS>

air,N+FLX=TABLE

Michelson,N+ProperName

● English NooJ resources are reused, Russian lexical

resources are original,

● A separate procedure implemented in Python is

launched for lemmatization and extraction of the terms.

Page 16: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

Evaluation and Results

Page 17: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

Hidden subject terms

A ladder is 5m long. How far from the base of a

wall should it be placed if it is to reach 4m up

the wall?

Hidden subject terms: Pythagorean Theorem, Hypotenuse, Сathetus…...

Nothing to extract!!!

Page 18: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

User statistics collection

● Collects user's activity

on front-end,

● Creates triples,

● Analyzes user's

answers on tests,

● Builds rating of the

terms, which caused

difficulties for students.

Page 19: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

Conclusion

● The ontologies of tests and student activity

have been developed,

● The tasks of the test have been linked with

system terms,

● The statistics gathering module has been

developed,

● The system provides rating of the terms,

which caused difficulties for students.

Page 20: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

Future Work

● Improve term extraction procedure by adding parallel

texts of tasks,

● Process units of measure in tasks to predict “hidden

terms”,

● Use relations between subject terms to improve the

quality of term extraction procedure,

● Refine term knowledge rating by replacing it by the

proper ranking formula.

Page 21: A Combined Method for E-Learning Ontology Population based on NLP and User Activity Analysis

Thank you!!!

The front-end of the e-learning system: http://ecole.ifmo.ru

Example of subject terms analytics for module "Interference and

Coherence":

http://openedu.ifmo.ru:8888/resource/Phisics:m_InterferenceAnd

Coherence?analytic=1

The source code: https://github.com/ailabitmo/linked-learning-

solution