Upload
micah
View
75
Download
0
Tags:
Embed Size (px)
DESCRIPTION
How to get your hands on the data of your LMS (e.g. Blackboard). Alan Berg Co-chair SIG SURF / Community officer Apereo LAI. Acknowledgements. Resources mentioned in Resources.txt, including: Understanding The Advanced Reporting Database John Knight 2005: - PowerPoint PPT Presentation
Citation preview
How to get your hands on the data of your LMS (e.g. Blackboard) Alan Berg Co-chair SIG SURF / Community officer Apereo LAI
Acknowledgements
Resources mentioned in Resources.txt, including:
Understanding The Advanced Reporting DatabaseJohn Knight 2005:
– 5 relevant slides are copied from this presentation (orange background).
Clip art: https://openclipart.org/
WARNING!
A lot of information will be covered in a short time.
Agenda
Who am IMotivationBlackBoard ProcessesBlackBoard database schemaPulling data out of BlackBoardSimple Analysis in RBrain Storm over possibilities
Wrangler | 07 maand 2014 4
Who am
I have relevant practical experience. Developed building Blocks Integrated stuff with UVA's SIS Run Quality Assurance on a number of large scale projects such as UvA's
Scale out Reported bugs Presented a couple of times (a long time ago) at BlackBoard conferences Old, old stuff: Blackboard 6 usage patterns and implications for the
Universiteit van Amsterdam: http://www.uva.nl/binaries/content/documents/personalpages/b/e/a.m.berg/en/tab-three/tab-three/cpitem%5B41%5D/asset
Pulled out data from BlackBoard for research projects
Motivation
To break down walls that get in the way of getting at your data.
By providing researchers with the details needed to communicate with the people that hold the keys to the data.
By providing system /functional administrators an idea of the processes needed by researchers to use the data
Questions for the audience
Point to a Google doc A quick count of the audience roles A quick question to the audience of what they see as
barriers to their research work.
1
BlackBoard Processes
Motivation: Being specific with processes, filenames and locations so that it is easier to find and do stuff
Take homes
There is a set of best practices that should be followed
BlackBoard files that matter Bb-config.properties Bb-tasks.xml
Scheduled tasks for reporting The process of moving tables to a backup database
Best Practices Accessing the Database
http://www.edugarage.com/display/BBDN/Blackboard+Open+Database+-+OpenDB
Protect your data Work with copies of your data Remember to test performance Use tools to access your databases
Don't reinvent the wheel See whats out in the wild Re-use and improve artefact's
Don't be arrogant: Ask the team that administers the system Look at the Open Database Schema
Questions for the audience
Point to a Google doc Additional best practices for accessing the Database
2
BB-config.properties
Global configuration file for Blackboard. Options include: Database JVM tuning Content Management Server threads Mail servers
Performance depends on tuning this file Named a property file. Follows a specific text format:
bbconfig.unix.httpd.ssl.portnumber=443 Location similar to:
/app/blackboard/config/bb-config.properties Applying changes via a script PushConfigUpdates.bat/.sh
BlackBoard Processes
Code that can run on any BlackBoard Server to get facts
SECURE YOUR DATASOURCETEST AGAINST A COPY OF YOUR DB
1) Create a user that can read BB-config.properties2) Create code that reads BB-config.properties3) Run Code through a scheduled task4) Generate time stamped reoprts or Text dumps
LATER5) Improve code based on requirements
PurgeAccumulator job
Default runs at 1 AM Does three things
Summarizes data Moves data to a backup DB Purges Production pf data older than a certain date Configured via bb-tasks.xml Runs stored procedures in the database
The gruesome details
<task-entry key="bb.stats.purging" version="60">
<task classname="blackboard.platform.tracking.PurgeApplicationTask">
<property name="delay" value="21600000" />
<property name="period" value="86400000" />
<property name="xml.registered.delay" value="1:00" />
<property name="xml.registered.period" value="24" />
<property name="days_to_keep" value="180" />
<property name="target" value="live" />
<property name="dev_null" value="/dev/null" />
<property name="command-line" value="/usr/local/blackboard/tools/admin/PurgeAccumulator.sh" />
</task>
</task-entry>
Schema Description
Number of supporting records are increasing per version
2012 – Partial list
2007 – Full List
Titel van de presentatie | 00 maand 0000
Questions for the audience
Point to a Google doc Based on the 2007 table set which sort of questions can
you ask when combined with SIS?– Promise to look at sending the relevant SQL
statements or publishing to the LASI Utrecht website
3
Building Blocks - More Provides supporting services And JSP (GUI) libraries From experience, permissions structure changes between
versions. Subtle changes in services Database schema is actually more stable. SQL queries more raw but are more efficient than service calls Was easier to learn Building blocks by code completion in IDE
(E.G: Eclipse) than through the documentation Steep learning curve
http://workgroups.clemson.edu/DCIT2803_BBDEV/Tutorial/BBTutorial.pdf
Take home: If you just want the data go directly to the database.Take home: If you just want the data go directly to the database.
Bb-manifest.xml and CodeDon't do this without a safety net
<permission type="java.lang.RuntimePermission" name="db.connection.bb_bb60" actions="connect,accept" />
ConnectionManager conman; int i=0;private Connection bbConnectDatabase() throws ConnectionNotAvailableException, InterruptedException { BbDatabase bbDb = DbUtil.safeGetBbDatabase(); conman = bbDb.getConnectionManager(); while(conn == null && i<10){ try { conn = conman.getConnection(); } catch(ConnectionNotAvailableException cnae){ Thread.sleep(1000); ++i; } } return conn; }
Getting to the database in Perl
use Config::Simple;
use DBI;
$config_file="/home/aberg/Desktop/BB_PILOT_2014/CONFIGURATION/database.cfg";
# Configuration file location
$config = new Config::Simple($config_file);
# Open Database Connection
$dbh = DBI->connect($config->param("ConnectionSettings.dsn"), $config->param("ConnectionSettings.dbuser"), $config-
>param("ConnectionSettings.pass")) or die "Connection Error: $DBI::errstr\n";
generate_user_report();
$dbh->disconnect;
Perl take home
Easy to connect Just different configuration for queries to the archive
database than production or development Perl is a well known language Perl is EXCELLENT for text processing Perl can generate unreadable code and not really a team
language (personal opinion) Great for writing functionality quickly Excellent for manipulating and merging data Understood by many System administrators
Favorite tool – SQL Developer
Everyone has their favorite tools Can export data efficiently Practice SQL statements Known to a wide community What are the ethics and security implications
STOP and ask Hendrik L to reinforce message from his presentation
Questions for the audience
Point to a Google doc Have you any favorite tools for use with databases or
combining data sources?
4
Titel van de presentatie | 00 maand 0000
ASR Schema Elements
Table Name(s) Data Stored Data Rentention
in main schemaACTIVITY_ACCUMULATOR
End user events. Up to 360 days of data
SYSTEM_TRACKING Daily summary stats generated
by summarization task in PurgeAccumulator
process
No limit (never purged)
APPLICATIONCOURSE_MAINCOURSE_ROLESCOURSE_USERSDATA_SOURCEINSTITUTION_ROLESNAVIGATION_ITEMSYSTEM_ROLES,USERSUSER_ROLES
Stores supporting datareferenced byACTIVITY_ACCUMULATOR for ASR Reporting
No limit. Tables are
never purged automatically.
Data Structure
Activity Accumulator Dictionary
Column DescriptionPK1 Primary Key
EVENT_TYPE Type of Event Posted
USER_PK1 Relates to USERS.PK1
COURSE_PK1 Relates to COURSE_MAIN.PK1
GROUP_PK1 Relates to GROUP.PK1
FORUM_PK1 Relates to FORUM_MAIN.PK1
INTERNAL_HANDLE
Relates to NAVIGATION_ITEM.INTERNAL_HANDLE
CONTENT_PK1 Relates to COURSE_CONTENTS.PK1
Activity Accumulator Dictionary
Column DescriptionDATA Additional information dependant on event type.
TIMESTAMP Date/Time the event was posted
STATUS 1 unless error occurred
MESSAGES Error message details
SESSION_ID Relationship to SESSIONS.PK1
REMOVE DATA and MESSAGES as might contain personal information?
Activity Accumulator Events
Event Description Relationships
LOGIN_ATTEMPT A user attempted to login to the system. Captures the session id of the attempted login.
SESSION_ID
LOGOUT A user clicked the logout button
SESSION_IDUSER_PK1
SESSION_INT A user connected to the system and a session was created
SESSION_ID
PAGE_ACCESS A page was accessed SESSION_IDUSER_PK1NAVIGATION_ITEM
COURSE_ACCESS
A page within a course was accessed
SESSION_IDUSER_PK1COURSE_PK1NAVIGATION_ITEM
Data Structure
PK1 are keys to other tables PK1 are effectively random numbers As long as we do not combine with other tables then
pseudo anonymous. Only combine when we have opted in
Data and messages columns may contain personal information.
Question for Hendrik L: How anonymous do we have to be?
5] Question for audience: How secure do we have to be with the data and how long can it live for?
6] Question for the audience: Do we need an audit trail?
Titel van de presentatie | 00 maand 0000
Example: Export in SQL developer
SQL Examples - General
select event_type, count(*) from BackupDB_name.activity_accumulator group by event_type
EVENT 2007 2012
LOGOUT 781618 1174147
SESSION_INT 20875358 20241290
CONTENT_ACCESS 5284051 14205997
COURSE_ACCESS 42266577 57808666
MODULE_ACCESS 15199704 66422352
LOGIN_ATTEMPT 36099987430655
PAGE_ACCESS 1528547 82827763
TAB_ACCESS 9715716 101974
eventTypes2007 <- c (1174147, 20241290, 14205997, 57808666, 66422352, 7430655,
82827763, 101974)
labels <- c ('LOGOUT', 'SESSION_INT', 'CONTENT_ACCESS',
'COURSE_ACCESS','MODULE_ACCESS', 'LOGIN_ATTEMPT', 'PAGE_ACCESS',
'TAB_ACCESS')
pie(eventTypes2007,labels)
R makes life easyWho uses R?
Titel van de presentatie | 00 maand 0000
Generating PDF with LATEX and R
\documentclass{article}\begin{document}\SweaveOpts{concordance=TRUE}<<echo=TRUE, fig=true,keep.source=FALSE>>=eventTypes2007 <- c (1174147, 20241290, 14205997,
57808666, 66422352, 7430655, 82827763, 101974)labels <- c ('LOGOUT', 'SESSION_INT', 'CONTENT_ACCESS',
'COURSE_ACCESS','MODULE_ACCESS', 'LOGIN_ATTEMPT', 'PAGE_ACCESS', 'TAB_ACCESS')
pie(eventTypes2007,labels)@\end{document}
Titel van de presentatie | 00 maand 0000
Why LATEX and R
Can be used to produce papers Turning echo on will show the reviewer the method's
used when analyzing data. Allows the reviewer to test the process with their own
data. Reproducible Standardizing on well known and well documented
practices.
Questions for the audience
Point to a Google doc Is it validate to contemplate a Dutch LA common
approach around R and Latex?
7
Transversing the information
Depends on the membership of your experiment.
Examples:
Whole course opted in: MOOC like conditions An individual user: Individual consent A member of a course Time limited
Whole course opted in
NEED TO FIND THE PRIMARY KEY OF THE COURSE IN COURSE_MAIN
select pk1 from BackupDB.course_main where course_id='20072008.1.testcourseonly'
select count(*) from STATS_2007_2008.activity_accumulator where course_pk1='22144'
select count(unique(user_pk1)), count(unique(session_id)), count(unique(content_pk1)), count(unique(event_type))from BackupDB.activity_accumulator where course_pk1='22144'
We now have the users PK1's. If needed we can relate back to the USERS Table etc.
GOTCHA: Active users
CRSMAIN_PK1 → USERS_PK1
select users_pk1 from STATS_2007_2008.course_users where crsmain_pk1='22144'
An individual user: Individual consent
Users table to find the right PK1Has also got LAST_LOGIN_DATEDangerous as you get to seePasswords
select pk1 from BackupDB.users where user_id='xxxxxx'
STUDENT_ID is descriptive
Reliability depends on provisionprocess
A member of a course
If you know the PK1 of the user and the PK1 of the course then
select * from BackupDB.activity_accumulator where user_pk1='1' and course_pk1='2'
Think in terms of set theoryPka and Pkb and Pkc …..
Time limited
Total activity in the last $delta_days PERL example
$query="select count(*) from $owner.activity_accumulator where course_pk1=\'$course_pk1\' and user_pk1=\'$fact_pk1{$user}\' and timestamp > sysdate - $delta_days";
Tool Specific
Null 200268422
content 31251028
announcements_entry 5082438
my_announcements 1969977
discussion_board_entry 1711946
check_grade 1157869
cp_gradebook2_modify_item 1047877
eph_ephorus-assignment 1046151
course_tools_area 852568
db_thread_list_entry606197
classic_course_catalog 514890
cp_announcements 442061
groups 404074
agroup 382299
staff_information 379582
cp_gradebook2_manage_download_grade 318683
cp_list_modify_users 300262
cp_gradebook 291804
group_file 194884
group_forum138692
cp_gradebook_needs_grading 120446
cp_groups 101073
list_courses 88757
discussion_board 85139
send_email 71699
group_email 67705
classic_club_catalog 65111
view_attempts 64153
cp_send_email 63435
db_collection_entry 63308
select internal_handle, count(*) as TOTAL from stats_2012_2013.activity_accumulator group by internal_handle order by TOTAL DESC
Number of unique handles (at UvA) = 345
SAMPLING BIASAre we ignoring the majority of events that are not associated with an internal handle?
Circling backQuestions for the audience
Point to a Google doc Which sort of questions can you ask when combined
with SIS?– Promise to look at sending the relevant SQL
statements or publishing to the LASI Utrecht website
3
Summary (so far)
Each click is recorded by BlackBoard The activity_accumulator table is the place to be. You now understand the files that are important You now understand the basic BlackBoard Schema Privacy and ethics limits how you search the data Privacy and ethics relate to the PK1's Privacy and ethics relates to which columns you can
export MORE?