
Presenter: Paul DeCarlo

Professor: Nouhad Rizk

Date: 6/11/2010

The Design and Development of an Expert System Prototype for Enhancing Exam Quality

DeCarlo, P., & Rizk, N. (2010). The Design and Development of an Expert System Prototype for Enhancing Exam Quality. International Conference on Electronic Learning in the Workplace.

Introduction

Data mining or knowledge discovery in databases (KDD) is the automatic extraction of implicit and interesting patterns from large data collections (Klosgen & Zytkow, 2002).

Rule discovery is one of the most popular data mining techniques, especially in educational data mining (EDM), because it presents the discovered information to the teacher in an intuitive way (Romero & Ventura, 2004).

Conventional rule-based expert systems use human expert knowledge to solve real-world problems that would normally require human intelligence. Expert knowledge is often represented in the form of rules or as data within the computer. Rule-based expert systems have played an important role in modern intelligent systems and their applications in strategic goal setting, planning, design, scheduling, fault monitoring, diagnosis and so on (Abraham, 2005).

Study Purpose

Course evaluations are typically done once per semester at colleges and universities. Furthermore, students who drop a course are usually not considered in these evaluations.

Course evaluations typically do not consider evaluating the efficacy of course materials including assignments, textbooks, review materials, etc.

Evaluations done at intervals would be able to capture issues as they happen and include students who intend to drop the course. This could inform the educator of what is actually happening in their course instead of providing information after the fact.

Most current data mining tools are too complex for educators to use. A system that automates this process and dynamically produces human-readable evaluations is therefore important if a system of this type is expected to be adopted by educators.

What is Exam Quality?

Exam quality refers to how well an examination of learned material reflects the information provided in course learning materials. Think of the way validity is defined in research methods.

What is Association Rule Learning?

Association rules are a data mining technique used to discover relations between variables in large example sets.

Support is defined as the probability that an example contains a subset X when randomly chosen from the total set of responses. The support of an association rule 'A=>B' is defined as the support of (A union B).
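In symbols, one common way to write these definitions is shown below. The notation is an assumption and does not come from the slides: T stands for the total set of responses and supp for support. Note that A union B is a union of item sets, so an example counts only if it contains every item of both A and B.

```latex
\[
\operatorname{supp}(X) = \frac{\lvert \{\, t \in T : X \subseteq t \,\} \rvert}{\lvert T \rvert},
\qquad
\operatorname{supp}(A \Rightarrow B) = \operatorname{supp}(A \cup B)
\]
```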

Confidence refers to how likely it is that a transaction containing A also contains B. The confidence of an association rule 'A=>B' is defined as the probability that an example contains both A and B divided by the probability that an example contains A, that is, the probability of B given A. This is the same as the support of (A union B) divided by the support of (A).
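As a minimal sketch, support and confidence can be computed directly from a list of examples. The four responses below are made up for illustration and are not data from the study; the function names are likewise hypothetical.

```python
from typing import FrozenSet, Iterable, List

# Hypothetical survey responses (NOT study data): each set holds the
# features that are true for one student.
responses: List[FrozenSet[str]] = [
    frozenset({"studies daily", "high GPA"}),
    frozenset({"studies daily", "high GPA", "owns textbook"}),
    frozenset({"studies daily"}),
    frozenset({"owns textbook"}),
]


def support(itemset: Iterable[str], examples: List[FrozenSet[str]]) -> float:
    """Fraction of examples that contain every item in `itemset`."""
    items = frozenset(itemset)
    return sum(1 for ex in examples if items <= ex) / len(examples)


def confidence(
    antecedent: Iterable[str],
    consequent: Iterable[str],
    examples: List[FrozenSet[str]],
) -> float:
    """support(A union B) / support(A) for the rule A => B."""
    supp_a = support(antecedent, examples)
    if supp_a == 0:
        raise ValueError("antecedent never occurs in the examples")
    both = frozenset(antecedent) | frozenset(consequent)
    return support(both, examples) / supp_a


# Rule under test: {studies daily} => {high GPA}
rule_support = support({"studies daily", "high GPA"}, responses)
rule_confidence = confidence({"studies daily"}, {"high GPA"}, responses)
print(rule_support, rule_confidence)
```

In this toy data, two of the four responses contain both features and three contain 'studies daily', so the rule has support 2/4 and confidence 2/3.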

Algorithm developed by Rakesh Agrawal (1993).

1. Minimum support is applied to find all frequent itemsets in a database.

2. These frequent itemsets and the minimum confidence constraint are used to form rules.
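The two steps can be sketched in a few lines of Python. This is an illustrative, unoptimized level-wise search in the spirit of Apriori, not the RapidMiner implementation; the function names, the thresholds, and the toy examples are all assumptions.

```python
from itertools import combinations
from typing import Dict, FrozenSet, List, Tuple

Itemset = FrozenSet[str]


def frequent_itemsets(examples: List[Itemset], min_support: float) -> Dict[Itemset, float]:
    """Step 1: level-wise search for all itemsets meeting min_support."""
    n = len(examples)
    current = [frozenset([i]) for i in sorted({i for ex in examples for i in ex})]
    frequent: Dict[Itemset, float] = {}
    k = 1
    while current:
        # Keep only the candidates of size k that meet the support threshold.
        level = {}
        for cand in current:
            supp = sum(1 for ex in examples if cand <= ex) / n
            if supp >= min_support:
                level[cand] = supp
        frequent.update(level)
        # Build size k+1 candidates whose size-k subsets are all frequent.
        k += 1
        candidates = set()
        for a, b in combinations(list(level), 2):
            union = a | b
            if len(union) == k and all(
                frozenset(s) in level for s in combinations(union, k - 1)
            ):
                candidates.add(union)
        current = list(candidates)
    return frequent


def generate_rules(
    frequent: Dict[Itemset, float], min_confidence: float
) -> List[Tuple[Itemset, Itemset, float, float]]:
    """Step 2: split each frequent itemset into antecedent => consequent."""
    rules = []
    for itemset, supp in frequent.items():
        for r in range(1, len(itemset)):
            for antecedent in combinations(sorted(itemset), r):
                a = frozenset(antecedent)
                conf = supp / frequent[a]  # every subset of a frequent set is frequent
                if conf >= min_confidence:
                    rules.append((a, itemset - a, supp, conf))
    return rules


# Hypothetical examples, not study data.
examples = [
    frozenset({"studies daily", "high GPA"}),
    frozenset({"studies daily", "high GPA", "owns textbook"}),
    frozenset({"studies daily"}),
    frozenset({"owns textbook"}),
]
frequent = frequent_itemsets(examples, min_support=0.5)
rules = generate_rules(frequent, min_confidence=0.8)
for antecedent, consequent, supp, conf in rules:
    print(set(antecedent), "=>", set(consequent), supp, conf)
```

Raising the minimum support prunes candidates early in step 1, which is where most of the computation goes; the minimum confidence only filters the rules formed in step 2.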

The rules this technique produces can be interpreted as easily as they can be read. For example, a typical rule may take the form {studies daily} => {has a high GPA}. This would mean that an example with the feature 'studies daily' is also likely to have the feature 'has a high GPA'.

Data Collection Driven by W-CAT model

We must have a set of data from which to mine our rules. Research has been done on applying data mining techniques to CMS logs (e.g., Moodle). Our system seeks to include subjective data and requires a survey interface to collect current data. Some of the information we ask for in the survey could be gathered from a CMS (homework / exam review completion).

To drive our data collection process and the overall workflow of our system, we used the Witty Cat model developed by Dr. Nouhad Rizk at the University of Houston to guide the creation of our survey.

The Witty Cat Model

Example Survey Questions & Results

• Data gathered using the open-source LimeSurvey online survey software.

• The responses can be considered valid, as invitations to the survey are distributed using a secure token system.

The RapidMiner Process Tree

• We used the open-source data mining tool RapidMiner to generate our rulesets. This tool allows for visualization and handling of remote databases.

• In our initial study (DeCarlo & Rizk 2010), the survey data was cleansed by converting the numerical grades to nominals A-F. These were then converted to binomial data.

• Questions which used a 5-point ranking scale were discretized into bins and processed as binomial data.

• Frequent itemsets were generated, and we then generated our association rules using a built-in implementation of Agrawal's Apriori algorithm.
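The cleansing steps above might look roughly like the following outside of RapidMiner. This is only a sketch: the grade cutoffs, the three bins for the 5-point scale, and the field names are assumptions, since the slides do not state them. Each record becomes a set of items, which is equivalent to a row of true/false (binomial) attributes.

```python
from typing import Dict, Set


def letter_grade(score: float) -> str:
    """Map a numerical grade to a nominal A-F (assumed 90/80/70/60 cutoffs)."""
    if score >= 90:
        return "A"
    if score >= 80:
        return "B"
    if score >= 70:
        return "C"
    if score >= 60:
        return "D"
    return "F"


def likert_bin(rating: int) -> str:
    """Discretize a 5-point rating into three assumed bins."""
    if not 1 <= rating <= 5:
        raise ValueError(f"rating out of range: {rating}")
    if rating <= 2:
        return "low"
    if rating == 3:
        return "neutral"
    return "high"


def to_items(record: Dict[str, float], grade_field: str = "exam2_score") -> Set[str]:
    """Turn one survey record into the set of items that are true for it."""
    items = set()
    for field, value in record.items():
        if field == grade_field:
            items.add(f"grade={letter_grade(value)}")
        else:
            items.add(f"{field}={likert_bin(int(value))}")
    return items


# Hypothetical record with made-up field names and values.
record = {"exam2_score": 58, "textbook_useful": 2, "review_video_helpful": 5}
items = to_items(record)
print(sorted(items))
```

A list of such item sets is the input that a frequent-itemset search expects.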

Rules generated in our case study

Results from 50 students enrolled in a college-level Computer Organization and Design course at the University of Houston in Fall 2009.

Impact of Pilot Study on Instructor Methodology

Our study showed that 62% of students owned the course textbook. Of that 62%, only 8% found it useful. This information allowed the instructor to consider teaching more from the text, removing the text completely, or adopting an alternative text. Further inquiry revealed that the students were in favor of a better textbook, specifically one which offered more of an overview of MIPS programming. This was corroborated by 62% of students supporting an increase in programming exercises.

Our rules implied 'students who viewed the video -> expected to do well on exam 2' and 'students who studied primarily using the review video -> received F'. We can consider the pairing of these rules to imply that the review video instills false confidence in students. I now personally inform my students that there is a review video, but that our research indicates it may lead to a failing grade if used on its own.

From these examples we can see that a system of this type may be beneficial to both students and instructors.

Development Issues raised in Pilot Study

We need to implement safeguards to protect against meaningless or contradictory rule generation. Seeking only pre-defined rules can cut the computational resources needed to generate our rules and solve both of these issues while remaining adaptive.
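One possible shape for such safeguards is sketched below. It is an assumption about how they could work, not the W-CAT implementation: a pre-defined template keeps only rules whose consequent concerns a target attribute, and when two rules share an antecedent but conclude different values of the same attribute, only the higher-confidence rule is kept. Here the template is applied as a filter after mining; pushing it into candidate generation is what would save computation. All rule contents are hypothetical.

```python
from typing import FrozenSet, List, NamedTuple, Set


class Rule(NamedTuple):
    antecedent: FrozenSet[str]
    consequent: FrozenSet[str]
    confidence: float


def attribute(item: str) -> str:
    """'grade=F' -> 'grade' (items are assumed to look like name=value)."""
    return item.split("=", 1)[0]


def matches_template(rule: Rule, targets: Set[str]) -> bool:
    """Pre-defined template: the consequent may only mention target attributes."""
    return all(attribute(i) in targets for i in rule.consequent)


def contradicts(r1: Rule, r2: Rule) -> bool:
    """Same antecedent, but different values concluded for one attribute."""
    if r1.antecedent != r2.antecedent:
        return False
    return any(
        i != j and attribute(i) == attribute(j)
        for i in r1.consequent
        for j in r2.consequent
    )


def filter_rules(rules: List[Rule], targets: Set[str]) -> List[Rule]:
    kept: List[Rule] = []
    candidates = [r for r in rules if matches_template(r, targets)]
    # Visit rules from most to least confident so the stronger rule wins.
    for rule in sorted(candidates, key=lambda r: r.confidence, reverse=True):
        if not any(contradicts(rule, k) for k in kept):
            kept.append(rule)
    return kept


# Hypothetical mined rules, not results from the study.
mined = [
    Rule(frozenset({"q1=high"}), frozenset({"grade=F"}), 0.7),
    Rule(frozenset({"q1=high"}), frozenset({"grade=A"}), 0.4),
    Rule(frozenset({"grade=F"}), frozenset({"q1=high"}), 0.9),
]
safe = filter_rules(mined, targets={"grade"})
print(safe)
```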

Other techniques may prove more useful for achieving a dynamic assessment, for example neural networks. A neural network trained on Moodle data combined with information on the students' final evaluations has been used to obtain models that predict which students are likely to pass a course (Calvo-Florez 2006). Other researchers suggest using a combination of techniques to achieve more interesting results (Romero 2007).

Current state of WittyCat

We have created a stand-alone data collection system which does not rely on LimeSurvey.

We are implementing the automation of the Apriori algorithm on collected data for one-click rule generation.

We are currently developing an inference engine that uses backward chaining to explain, in a human-readable format, how its conclusions were reached.
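To illustrate the idea, a backward-chaining explainer can be sketched as a recursive search that starts from a conclusion and works back to facts found in the data. The rules, fact names, and output wording below are hypothetical and do not describe the engine under development.

```python
from typing import FrozenSet, List, Optional, Sequence, Set, Tuple

# Hypothetical rule base: (premises, conclusion).
RULES: List[Tuple[Tuple[str, ...], str]] = [
    (("review_video_only",), "false_confidence"),
    (("false_confidence", "low_practice"), "at_risk"),
]


def prove(
    goal: str,
    facts: Set[str],
    rules: Sequence[Tuple[Tuple[str, ...], str]],
    depth: int = 0,
    seen: FrozenSet[str] = frozenset(),
) -> Optional[List[str]]:
    """Return indented explanation lines if `goal` can be derived, else None."""
    indent = "  " * depth
    if goal in facts:
        return [f"{indent}{goal}: reported in the survey data"]
    if goal in seen:
        return None  # guard against circular rules
    for premises, conclusion in rules:
        if conclusion != goal:
            continue
        lines = [f"{indent}{goal}: follows from {' and '.join(premises)}"]
        for premise in premises:
            sub = prove(premise, facts, rules, depth + 1, seen | {goal})
            if sub is None:
                break  # this rule fails; try the next rule for the goal
            lines.extend(sub)
        else:
            return lines
    return None


facts = {"review_video_only", "low_practice"}
explanation = prove("at_risk", facts, RULES)
if explanation is not None:
    print("\n".join(explanation))
```

Because the search starts from the conclusion, the trace it returns doubles as the explanation: each line names the rule or the fact that justified that step.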

Overview of the Desired Final Product

A dynamic, adaptive, specified, intelligent, assessment tool with expert adaptation.

How you can contribute

If you are teaching an online course with examinations given at intervals, we can use your data and generate feedback. We are also interested in your subjective critiques of the W-CAT analysis.

To use the system, simply visit wittycat.volatileassertion.com and register. Next, watch the instructional video and import your students and course materials as outlined in your syllabus.

Your participation can help us determine the subjective validity of the assessments produced by our tool.

You may contact Paul DeCarlo at [email protected] or Dr. Nouhad Rizk at [email protected] for further information.

Suggestions, Questions, Critiques?