INTRO TO DATABASE CLASSES: EXERCISING THE COMPUTING ONTOLOGY
Lois Delcambre and Felicia Decker
with Dave Maier, Len Shapiro, Rafael Fernandez

HIGH LEVEL GOALS
Index the topics described in syllabi for DB class
Exercise the Computing Ontology (CO):
  to describe lecture topics
  to describe (lots of small) things, including small-grained pieces of lectures, assignments, tests, …
Build analysis tools to determine whether different classes covered the same topics, in the same depth, in the same order

Start of DB portion of the Computing Ontology

A bit more of the DB portion of the CO

BACKGROUND WORK FOR FELICIA
Learn XML, RDF, OWL
Become familiar with the CO
Use various OWL tools (Protégé)
Find sample data:
  Search for DB course syllabi (using CITIDEL, Google)
  Check to see if slides, assignments, and tests were available online
  Select 6 syllabi

REPRESENT SYLLABI IN XML W/XML SCHEMA
Define initial XML schema
Investigate the use of a tool that would allow forms-based data entry using an XML schema (ultimately didn’t find one to use)
Put the 6 syllabi into the XML schema (iterate)
Prepare an XML summary of each syllabus using XSL

CS 145 Syllabus – in XML, with our XML Schema
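
The project's actual XML schema is not reproduced in this text, so the following is only a minimal sketch of how a syllabus stored in XML could be read from Java. The element names (syllabus, lecture, topic) and the sample topics are assumptions made for illustration, not the schema used in the project.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical element names: <syllabus>, <lecture>, <topic>.
// The real schema from the project is not shown in the slides.
final class SyllabusReader {

    /** Returns the trimmed text of every <topic> element, in document order. */
    static List<String> readTopics(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            Document doc = builder.parse(new InputSource(new StringReader(xml)));
            NodeList nodes = doc.getElementsByTagName("topic");
            List<String> topics = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                topics.add(nodes.item(i).getTextContent().trim());
            }
            return topics;
        } catch (Exception e) {
            // Wrap parser and I/O errors so callers need no checked exceptions.
            throw new IllegalArgumentException("Could not parse syllabus XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample document, not one of the six syllabi.
        String xml = "<syllabus>"
                   + "<lecture number=\"1\"><topic>authorization</topic></lecture>"
                   + "<lecture number=\"2\"><topic>query optimization</topic></lecture>"
                   + "</syllabus>";
        System.out.println(readTopics(xml));
    }
}
```

A list of topic strings in lecture order is enough input for the labeling and comparison steps described on the later slides.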

LABEL/INDEX LECTURE TOPICS W/CO TERMS
Hand label each topic (as listed on the syllabus) with one or more CO terms

CS 145 Syllabus – 1st class, with topic shown

LABEL/INDEX LECTURE TOPICS W/CO TERMS (CONT.)
Identify list of questions:
  Terms with multiple paths
  Missing terms
  Terms where Felicia wasn’t sure they were right
Prepare a “vanilla” spreadsheet of all questions
Have 4 DB profs answer the questions/choose terms
Compile feedback

MULTIPLE PATH EXAMPLE
Lecture topic: authorization
Ontology term: Ownership_&_Access_Control_-_Authorization_Techniques
Paths:
  Information_Topics/Database_Systems/Components_of_Database_Systems/Database_Administration/Ownership_&_Access_Control_-_Authorization_Techniques
  Information_Topics/Managing_the_Database_Environment/Database_Administration/Ownership_&_Access_Control_-_Authorization_Techniques

MULTIPLE PATHS – EXAMPLE 2
Lecture topic: buffer replacement policy
Ontology term: Buffer_Management
Paths:
  Information_Topics/File_Processing/Buffer_Management
  Information_Topics/Database_Systems/Physical_Database_Design/File_Processing/Buffer_Management

MULTIPLE PATHS – EXAMPLE 3
Lecture topic: query optimization
Ontology term: Query_Optimization
Paths:
  Information_Topics/Database_Systems/Database_Languages/Query_Languages/Query_Optimization
  Information_Topics/Storage_and_Retrieval_of_Semistructured_Information/Database_Languages/Query_Languages/Query_Optimization
  Programming_Languages/Programming_Language_Classifications/Query_Languages/Query_Optimization
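
Terms with multiple paths arise because a term, or one of its ancestors, has more than one parent in the ontology. As a rough sketch (not the project's own tooling), the ontology can be modeled as a map from each term to its parents, and every root-to-term path enumerated recursively. The class and method names here are hypothetical; the sample edges are read off the two Buffer_Management paths in Example 2.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: models the ontology as a DAG of parent/child edges.
final class OntologyPaths {
    // child term -> its parent terms, in insertion order
    private final Map<String, List<String>> parents = new LinkedHashMap<>();

    void addEdge(String parent, String child) {
        parents.computeIfAbsent(child, k -> new ArrayList<>()).add(parent);
    }

    /**
     * Returns every path from a root to the given term as a slash-separated
     * string. A term with no recorded parent is treated as a root.
     * Assumes the graph contains no cycles.
     */
    List<String> pathsTo(String term) {
        List<String> ps = parents.getOrDefault(term, Collections.emptyList());
        List<String> result = new ArrayList<>();
        if (ps.isEmpty()) {
            result.add(term);
            return result;
        }
        for (String parent : ps) {
            for (String prefix : pathsTo(parent)) {
                result.add(prefix + "/" + term);
            }
        }
        return result;
    }

    /** True when the term would raise a "multiple paths" question. */
    boolean hasMultiplePaths(String term) {
        return pathsTo(term).size() > 1;
    }

    /** Edges taken from the two Buffer_Management paths in Example 2. */
    static OntologyPaths sample() {
        OntologyPaths o = new OntologyPaths();
        o.addEdge("Information_Topics", "File_Processing");
        o.addEdge("Information_Topics", "Database_Systems");
        o.addEdge("Database_Systems", "Physical_Database_Design");
        o.addEdge("Physical_Database_Design", "File_Processing");
        o.addEdge("File_Processing", "Buffer_Management");
        return o;
    }

    public static void main(String[] args) {
        for (String path : sample().pathsTo("Buffer_Management")) {
            System.out.println(path);
        }
    }
}
```

In this sample Buffer_Management has a single parent; the two paths come from File_Processing appearing under both Information_Topics and Physical_Database_Design, which is why the check walks all ancestors rather than only the immediate parents.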

(POSSIBLE) MISSING TERMS
Embedded SQL
Views
External Sorting
SQL Queries
This path: Information_Topics/Database_Systems/Database_Languages/SQL
has these children:
  SQL_Optimization_Techniques
  SQL_as_DDL
  SQL_as_DML
  Stored_Procedures
  Triggers

PRODUCE REPORTS
Write a Java program that accepts multiple syllabi as input (in XML) and produces:
  3-part summary of each syllabus
  Topical comparison report, showing common topics and other topics across the input syllabi
  Rank comparison report, showing the 2nd through last syllabi compared to topics listed in order of the first class
  Full syllabus report, showing CO terms, with terms that have questions highlighted (in yellow)

Comparing two syllabi for coverage of topics: common topics highlighted in yellow

Unique topics, for each class, shown below
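
The slides do not show the report program's code. The following is one plausible sketch of the topical comparison, assuming each syllabus has already been reduced to a list of CO terms: common topics are a set intersection, and the unique topics for each class are set differences. The class name, method names, and the two sample term lists are made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical sketch of the topical comparison; not the project's program.
final class TopicalComparison {

    /** CO terms that appear in both classes, in the order of the first class. */
    static List<String> common(List<String> first, List<String> second) {
        Set<String> result = new LinkedHashSet<>(first);
        result.retainAll(new HashSet<>(second));
        return new ArrayList<>(result);
    }

    /** CO terms that appear in 'mine' but not in 'other', in the order of 'mine'. */
    static List<String> uniqueTo(List<String> mine, List<String> other) {
        Set<String> result = new LinkedHashSet<>(mine);
        result.removeAll(new HashSet<>(other));
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        // Made-up term lists for two classes, using CO term names from the slides.
        List<String> classA = List.of("Query_Optimization", "Buffer_Management", "Triggers");
        List<String> classB = List.of("Triggers", "SQL_as_DML", "Query_Optimization");

        System.out.println("Common:      " + common(classA, classB));
        System.out.println("Only in A:   " + uniqueTo(classA, classB));
        System.out.println("Only in B:   " + uniqueTo(classB, classA));
    }
}
```

Comparing exact term names is the simplest choice; it treats two classes as covering the same topic only when they were labeled with the same CO term.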

Comparing the order of topics covered: topics listed in order of 1st class

Additional topics from 2nd class shown below
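
The rank comparison can be sketched in the same spirit, again assuming each class is an ordered list of CO terms. Each topic of the first class is listed with its position there and its position in the second class, and topics that appear only in the second class are collected separately. The names and the tab-separated row format are assumptions for illustration, not the layout of the actual report.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch of the rank comparison; not the project's program.
final class RankComparison {

    /**
     * One row per topic of the first class: topic, 1-based position in the
     * first class, and 1-based position in the second class ("-" if absent).
     */
    static List<String> rows(List<String> first, List<String> second) {
        List<String> rows = new ArrayList<>();
        for (int i = 0; i < first.size(); i++) {
            String topic = first.get(i);
            int j = second.indexOf(topic);
            String otherRank = (j >= 0) ? String.valueOf(j + 1) : "-";
            rows.add(topic + "\t" + (i + 1) + "\t" + otherRank);
        }
        return rows;
    }

    /** Topics covered by the second class only, in the second class's order. */
    static List<String> additional(List<String> first, List<String> second) {
        List<String> extra = new ArrayList<>(second);
        extra.removeAll(first);
        return extra;
    }

    public static void main(String[] args) {
        // Made-up orderings for two classes, using CO term names from the slides.
        List<String> first = List.of("SQL_as_DDL", "SQL_as_DML", "Triggers");
        List<String> second = List.of("SQL_as_DML", "Query_Optimization", "SQL_as_DDL");

        rows(first, second).forEach(System.out::println);
        System.out.println("Additional: " + additional(first, second));
    }
}
```

For more than two syllabi, the same rows could carry one extra column per additional class, with the first class still fixing the row order.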

LABELING LECTURES – AT FINE GRAIN
Hand-label fine-grained bits of lecture slides, using terms from the CO
Hand-label individual questions from a midterm test using terms from the CO
Investigated the use of the Apache POI tool to extract text from PowerPoint, programmatically
Work in progress

NOVEMBER 2009
Explain what this project is about
Invite people to:
  Contribute their DB class syllabus to Ensemble
  Enter their syllabus into XML
  Index their topics using the computing ontology
We could use the Syllabus Collection in Citidel
We could use the CO in Drupal
Write a description/story; build a UI to do the XML and indexing