Learning Analytics – Tools & Techniques For Analysing Large Volumes Of Educational Data Dr. Richard Price, Data Scientist, Planning Services, Flinders

Learning Analytics – Tools & Techniques For Analysing Large Volumes Of Educational Data

Dr. Richard Price,

Data Scientist,

Planning Services,

Flinders University

History of The Techniques and Methods of Learning Analytics

• Learning analytics draws upon techniques from a number of established fields:

– Statistics

– Artificial Intelligence

– Machine Learning

– Data Mining

– Social Network Analysis

– Text Mining and Web Analytics

– Operational Research

– Information Visualization

• Application domains such as business intelligence, national security intelligence and learning analytics all have an interest in analysing large volumes of data from disparate data sources and are providing the business cases for the rapid growth in ‘big data’ & data analytics.

• Learning analytics encompasses support to both the business and teaching functions of the learning institution.

Data Types

• Structured data

– Typically stored in databases or spreadsheets and managed in accordance with a standardised storage format and ontology, e.g. names, place names

– E.g. SATAC applications, load, enrolments, FLO usage data

• Unstructured data

– text, audio, imagery, video

– E.g. student email, chat rooms, questionnaire responses, lecture videos (audio & video)

• Different data types lend themselves to different analytical techniques. Unstructured data often requires pre-processing to enable structured data analysis.

• Unstructured data analysis

– Text: document clustering, topic detection, entity extraction (people, places, locations, dates, times etc.), sentiment analysis (+,-)

– Audio : speaker identification, language identification, speech to text, keyword spotting

– Video analysis : face recognition, object recognition, target tracking

Structured Data Analysis

Descriptive statistics – sums, means, std devs, basic plotting (graphs, charts, histograms)

Data visualisation – tools that enable the human to see meaningful patterns in data

Machine learning – tools that enable computers to find patterns in data to perform either classification, clustering or prediction, e.g. decision trees, neural networks, support vector machines, linear regression, self organising maps, k-means

Predictive analytics – algorithmic approaches (generally machine learning) for predicting key target variables of interest.

Example LA projects: identification of ‘at risk’ students (Student Success Project), future University enrolments, topic enrolments

Data Visualisation

[Figures: visualisation examples for Structured Data and Unstructured Data]

Combining Structured & Unstructured Data Sources

Advanced Data Visualisation

Predicting Enrolments From Applications Data

• Aim: To predict next year’s commencing load using past 3 years of SATAC applications data.

• Predictions based at the applicant level – not time series based.

• Adopted a decision tree machine learning based approach.

• Input variables for each applicant included: academic performance, schooling, demographics (e.g. age, gender and postcode), information regarding each of the applicant’s preferences (preference number, course, institution and institution campus) and a number of proximity variables.

• Output (target) variable : whether the student was enrolled at Flinders University at Semester 1 census.
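A decision tree grows by repeatedly choosing the input variable that best separates the target classes. A later slide names CHAID as the algorithm used, and CHAID-style trees rank candidate splits with a chi-square test of each predictor against the target. The sketch below shows only that ranking step, in plain Python. It is a simplified illustration, not the project's model: the variable names and the four applicant rows are hypothetical, and a fuller CHAID implementation would also merge similar categories and compare adjusted p-values rather than raw statistics.

```python
from collections import Counter


def chi_square(values, targets):
    """Pearson chi-square statistic for a categorical predictor vs a categorical target."""
    n = len(values)
    row_totals = Counter(values)
    col_totals = Counter(targets)
    observed = Counter(zip(values, targets))
    stat = 0.0
    for value, row_n in row_totals.items():
        for target, col_n in col_totals.items():
            expected = row_n * col_n / n
            stat += (observed[(value, target)] - expected) ** 2 / expected
    return stat


def best_split(rows, target_key):
    """Name of the predictor with the largest chi-square statistic against the target."""
    targets = [row[target_key] for row in rows]
    predictors = [key for key in rows[0] if key != target_key]
    return max(predictors, key=lambda k: chi_square([row[k] for row in rows], targets))


# Hypothetical applicant rows, for illustration only.
applicants = [
    {"first_preference": "yes", "gender": "F", "enrolled": 1},
    {"first_preference": "yes", "gender": "M", "enrolled": 1},
    {"first_preference": "no", "gender": "F", "enrolled": 0},
    {"first_preference": "no", "gender": "M", "enrolled": 0},
]

split_variable = best_split(applicants, "enrolled")
```

In this toy data the hypothetical `first_preference` variable separates enrolled from non-enrolled applicants perfectly, so it is selected ahead of `gender`.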

Predicting Enrolments From Applications Data

• The three P’s - Prestige, Proximity & Price

• Proximity input variables

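– For two given points P1 = (lat1, lon1), P2 = (lat2, lon2), with latitudes and longitudes in radians, the great-circle (‘haversine’) distance in kilometres between P1 and P2 is calculated as:

d(P1,P2) = ACOS(SIN(lat1)*SIN(lat2) + COS(lat1)*COS(lat2)*COS(lon2-lon1)) * 6371

– This expression is the spherical law of cosines form of the great-circle distance; 6371 is the mean Earth radius in kilometres.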

– Haversine distance calculated between applicant’s primary residence and all SA major University campuses, with each value being an input into the machine learning algorithm.

• Two models developed: (a) from the 1st week in September; (b) from the 2nd week in January.

• Training data consisted of 3 years of data 2011, 2012 & 2013 to predict 2014 enrolments - 25,551 training examples for September and 74,516 for January.

• A number of commonly used machine learning algorithms could have been used; we chose to adopt a CHAID decision tree algorithm.
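The proximity variables described above can be sketched in a few lines of Python. The ACOS expression on the slide is the spherical law of cosines form of the great-circle distance; the haversine form gives the same distance and is better conditioned for short distances. Both are shown here, taking degrees as input and converting to radians. The campus names and coordinates are hypothetical placeholders, not the project's data.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, the constant used in the slide's formula


def law_of_cosines_km(lat1, lon1, lat2, lon2):
    """Great-circle distance using the ACOS form from the slide (inputs in degrees)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    cos_angle = math.sin(p1) * math.sin(p2) + math.cos(p1) * math.cos(p2) * math.cos(dlon)
    # Clamp so rounding error cannot push acos outside its domain.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.acos(cos_angle) * EARTH_RADIUS_KM


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance using the haversine form (inputs in degrees)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dlat = p2 - p1
    dlon = math.radians(lon2 - lon1)
    a = math.sin(dlat / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def proximity_features(home, campuses):
    """One distance-in-km feature per campus for a single applicant."""
    return {
        name: haversine_km(home[0], home[1], lat, lon)
        for name, (lat, lon) in campuses.items()
    }


# Hypothetical coordinates (degrees), for illustration only.
campuses = {"campus_a": (-35.0, 138.5), "campus_b": (-34.9, 138.6)}
features = proximity_features((-34.95, 138.55), campuses)
```

Each value in `features` would then be one input column for the applicant, alongside the other variables listed earlier.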

Predicting Enrolments From Applications Data

• Results:

• Lift Versus Output Percentile Profiles For the September Model

[Figure: lift profiles, Training and Validation panels]

Model      Number Of Applicants (Predictions)   Predicted Commencing Load   Actual Commencing Load   % Error
September  8557                                 1394                        1365                     2.1
January    26457                                4340                        3858                     12.5
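The slides do not state how the % Error column is computed. The reported figures are consistent with the prediction error taken relative to the actual commencing load, which is the assumption made in this short check.

```python
def percent_error(predicted, actual):
    """Signed prediction error relative to the actual value, in percent."""
    return (predicted - actual) / actual * 100.0


# Predicted and actual commencing load, taken from the results table.
results = {
    "September": {"predicted": 1394, "actual": 1365},
    "January": {"predicted": 4340, "actual": 3858},
}

errors = {
    model: round(percent_error(r["predicted"], r["actual"]), 1)
    for model, r in results.items()
}
# errors == {"September": 2.1, "January": 12.5}
```

Both models over-predicted, so the signed error is positive in each case.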

Predicting Enrolments From Applications Data

• The strong consistency of the lift profiles between training results and test and validation results is indicative of structural patterns of behaviour that appear to exist across applicants to South Australian Universities.

• These patterns of behaviour appear to be captured by the rules contained within the decision trees produced during the training stage of the modelling process.

• Paper reporting this work accepted for presentation at the Australian Association for Institutional Research Forum in November & possible publication in the Journal of Institutional Research.

• If future years’ performance proves to be similar, the approach should be able to provide valuable support to the management of the applications process.
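The slides do not define how the lift profiles were calculated. A common definition, assumed here, is cumulative lift: rank applicants by predicted score, take the top p percent, and divide the enrolment rate in that group by the overall enrolment rate. The scores and outcomes below are made up for illustration.

```python
def cumulative_lift(scores, outcomes, percentiles=(10, 20, 50, 100)):
    """Cumulative lift at each percentile of applicants ranked by predicted score.

    scores   -- predicted probability of enrolling, one per applicant
    outcomes -- 1 if the applicant actually enrolled, else 0
    """
    if len(scores) != len(outcomes) or not scores:
        raise ValueError("scores and outcomes must be non-empty and the same length")
    base_rate = sum(outcomes) / len(outcomes)
    if base_rate == 0:
        raise ValueError("lift is undefined when no applicant enrolled")
    # Highest-scoring applicants first.
    ranked = [y for _, y in sorted(zip(scores, outcomes), key=lambda pair: pair[0], reverse=True)]
    lift = {}
    for pct in percentiles:
        top_n = max(1, round(len(ranked) * pct / 100))
        lift[pct] = (sum(ranked[:top_n]) / top_n) / base_rate
    return lift


# Hypothetical scores and outcomes for ten applicants.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.0]
outcomes = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
profile = cumulative_lift(scores, outcomes)
```

Computing the same profile on training data and on held-out data, then comparing the two curves, is one way to check the kind of consistency described above.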

Predicting Topic Enrolments

• Planning Services was approached by the School of Nursing to predict future topic enrolments to assist in resource and placement management.

• Primary focus on predicting topic enrolments for 2nd year undergraduate nursing topics.

• Largely deterministic program complicated by pre-requisites, large numbers of advance credit 2nd year commencers, relatively high percentages of part-time students and a lack of historical training data due to a major course restructure in 2013.

Predicting Topic Enrolments

• Similar machine learning (decision tree) approach adopted; however, input variables consisted only of: course code, attendance type, and previous topics passed (no student demographic or BOA information).

• Binary target variable: 1 = did enrol in target topic, 0 = did not enrol in target topic

• Under the new program, 2nd year topics are being run for the first time in 2014, so only 1st year 2013 students are available to train and test on. Testing gave promising results and a model was developed to predict topic enrolments for 2015.

• Predictions for all seven 2nd year nursing topics were provided and validated by the School as being consistent with their estimates.

• The School of Nursing has requested that the approach become part of its standard business process in future years, and discussions are underway as to how Planning Services can meet this request.

• School of Education, Humanities and Law have provided 12 topics of interest to help Planning Services further develop the approach within a less constrained course structure.
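To make the topic enrolment inputs concrete, the sketch below shows one way the three input variables (course code, attendance type, previous topics passed) and the binary target could be turned into rows for a decision tree. It is an assumption about the data layout, not the project's actual pipeline: every code, field name and record here is hypothetical.

```python
def encode_student(record, course_codes, attendance_types, first_year_topics):
    """Turn one student record into a flat 0/1 feature vector."""
    passed = set(record["topics_passed"])
    features = [int(record["course_code"] == code) for code in course_codes]
    features += [int(record["attendance_type"] == kind) for kind in attendance_types]
    features += [int(topic in passed) for topic in first_year_topics]
    return features


def build_training_set(records, target_topic, course_codes, attendance_types, first_year_topics):
    """Feature rows X and binary targets y (1 = enrolled in the target topic)."""
    X = [encode_student(r, course_codes, attendance_types, first_year_topics) for r in records]
    y = [int(target_topic in r["enrolled_topics"]) for r in records]
    return X, y


# Hypothetical codes and records, for illustration only.
COURSE_CODES = ["COURSE_A", "COURSE_B"]
ATTENDANCE_TYPES = ["full-time", "part-time"]
FIRST_YEAR_TOPICS = ["TOPIC_101", "TOPIC_102"]

records = [
    {"course_code": "COURSE_A", "attendance_type": "full-time",
     "topics_passed": ["TOPIC_101", "TOPIC_102"], "enrolled_topics": ["TOPIC_201"]},
    {"course_code": "COURSE_B", "attendance_type": "part-time",
     "topics_passed": ["TOPIC_101"], "enrolled_topics": []},
]

X, y = build_training_set(records, "TOPIC_201", COURSE_CODES, ATTENDANCE_TYPES, FIRST_YEAR_TOPICS)
```

One such training set would be built per target topic, since each 2nd year topic has its own enrol / did not enrol outcome.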

In Conclusion

• Learning analytics is still in its infancy.

• The Student Success Project, Topic Enrolment and University Enrolment Prediction projects have demonstrated some early promise.

• Across the University we have the technical expertise and strong management support to progress learning analytics at Flinders.

• Particularly keen to work with the faculties to progress analytics in support of the teaching function.

• Performing research-like activities within an operational environment – looking for trailblazers without the fear of failure.

• We’re keen, enthusiastic and we’re here to help!
