Knowledge Elicitation Plug-in for Protégé - Card Sorting and Laddering
A thesis submitted to the University of Manchester for the degree of Master
of Science in the Faculty of Science and Engineering
2005
Yimin Wang
School of Computer Science
List of Contents
List of Contents
List of Figures
Abstract
Declaration
Copyright Statement
Acknowledgement
1. Introduction
1.1 Motivation
1.2 Approach
1.3 Thesis Outline
2. Knowledge Elicitation
2.1 Overview
2.1.1 History
2.1.2 Traditional Knowledge Elicitation Methods
2.2 Card Sorting
2.3 Laddering
2.4 AKT Project
2.5 PCPACK Toolkits
3. The Semantic Web
3.1 The Semantic Web Overview
3.2 Ontology and Ontology Engineering
3.2.1 Introduction
3.2.2 Ontology Engineering
3.3 The CO-ODE Project
3.4 Protégé
4. User-centered Design
4.1 Introduction to User-centered Design
4.2 Plug-in Design Principles
4.3 Interview for User Participant Design
4.3.1 Unstructured Interview Design
4.3.2 Structured Interview Design
4.4 Analysis and Conclusion
4.4.1 Unstructured Interview
4.4.2 Structured Interview
4.5 Conclusion
5. Implementation of Knowledge Elicitation Plug-in
5.1 Scope and Requirements
5.2 Designing Issues and Structure
5.2.1 Case Study of Interviews
5.2.2 Requirement Analysis
5.2.3 System Structure Design
5.2.4 Software Development Platform
5.2.5 Output Format
5.3 Plug-in for Protégé
5.3.1 Common Technical Requirements
5.3.2 Integration Method
5.4 Implementation
5.4.1 Implement Concepts
5.4.2 Software Java Classes
5.5 Conclusion
6. Software Testing and User Evaluation
6.1 User Evaluation Methodology
6.1.1 Software Setup
6.1.2 Elicit Knowledge using card sorting
6.1.3 Laddering the Result
6.1.4 Using the transaction manager
6.2 User Evaluation
6.2.1 Interface Evaluation
6.2.2 Functional Evaluation
6.3 Evaluation Result Analysis and Conclusion
7. Conclusion and Future Works
7.1 General Conclusion
7.2 Future Work
References
Appendix
A. The Interviewees’ Profile
B. User Testing and Evaluation Results
B.1 Results from Knowledge Management Community
B.2 Results from Computer Science Students
B.3 Results from Business Student
C. Software Setup
List of Figures
Figure 2.1: The Traditional card sorting (Nielsen, 1995)
Figure 2.2: The Snapshot of Adaptiva System
Figure 2.3: The COHSE Structure
Figure 2.4: Amilcare System Snapshot
Figure 2.5: The PCPACK System Structure
Figure 3.1: The Classic Cake Diagram
Figure 3.2: OWLViz Snapshot
Figure 3.3: Protégé Wizards Snapshot
Figure 3.4: OWLDoc Snapshot
Figure 3.5: The Manchester Pizza Finder Snapshot
Figure 3.6: An Example of OWL Syntax
Figure 4.1: An Example of User Participant Design Activities
Figure 5.1: System Structure Diagram
Figure 5.2: Sample Code of Using Protégé API
Figure 5.3: A Diagram of Workflow Prototype
Figure 5.4: Java Class JTransaction UML Diagram
Figure 5.5: Java Class KE UML Diagram
Figure 5.6: Java Class JCardSorting UML Diagram
Figure 5.7: Java Class JDocElicitation UML Diagram
Figure 5.8: Java Class JLaddering UML Diagram
Figure 6.1: The Document Elicitation Frame
Figure 6.2: Card Sorting Tool
Figure 6.3: Card Sorting and Laddering Tool Appears Simultaneously
Figure 6.4: Laddering Tool
Figure A.1: Snapshot of Choosing KEToolTab
Abstract
The next generation of the Web is expected to be the Semantic Web. Ontologies have
been widely accepted as the primary method of representing knowledge in the
Semantic Web. Knowledge elicitation is usually the first step in building ontologies.
A number of knowledge elicitation toolkits such as Protégé have been developed to
assist users in this process. However, traditional knowledge elicitation techniques,
such as card sorting and laddering, are performed manually and therefore lack the
potential efficiency and correctness of automated or partially automated approaches.
In this thesis we implement a plug-in for Protégé that allows knowledge to be elicited
graphically from documents using card sorting and laddering, thereby supporting the
process of building and maintaining ontologies. There is also an opportunity to
employ user-centred design principles to involve users in the design sessions, which
can be noticeably helpful when implementing the software user interface.
Furthermore, future research will benefit from the user testing and evaluation. The
current feedback from the user evaluation shows that the knowledge elicitation
plug-in for Protégé developed in this project has already met many of the users’
expectations and saves users considerable time in their daily work.
Declaration
No portion of the work referred to in the thesis has been submitted in support of an
application for another degree or qualification of this or any other university or
other institute of learning.
Copyright Statement
1) Copyright in text of this thesis rests with the Author. Copies (by any process)
either in full, or of extracts, may be made only in accordance with instructions
given by the Author and lodged in the John Rylands University Library of
Manchester. Details may be obtained from the Librarian. This page must form
part of any such copies made. Further copies (by any process) of copies made
in accordance with such instructions may not be made without the permission
(in writing) of the Author.
2) The ownership of any intellectual property rights which may be described in
this thesis is vested in the University of Manchester, subject to any prior
agreement to the contrary, and may not be made available for use by third
parties without the written permission of the University, which will prescribe
the terms and conditions of any such agreement.
3) Further information on the conditions under which disclosures and exploitation
may take place is available from the Head of the School of Computer Science.
Acknowledgement
I would like to express my appreciation to Prof. Alan Rector and Dr. Robert
Stevens from the School of Computer Science at the University of Manchester for their
sound supervision and their theoretical and practical guidance throughout this M.Sc.
project. I also want to express my special gratitude to those who participated in the
user-centered design activities, including the pre-implementation design, software
testing and evaluation. They are Ms. Yiwen Zhu, Mr. Peihong Ke, Mr. Yun Zhang,
Mr. Dominic Matchett, Mr. Ian Pettman, Mr. Kearon McNicol, Mr. Matthew
Horridge and four anonymous users of the W3China.org BBS forum.
Thanks again for their kind help and valuable suggestions. Finally, I am sincerely
grateful for the support of my parents, Mr. Xiaohua Wang and Ms. Shuhua Chen.
Thank you for everything.
1. Introduction
1.1 Motivation
In today’s world, large organizations, such as international enterprises or
universities, find it difficult to manage large numbers of documents and to extract
knowledge from them. Finding accurate information effectively is an increasingly
prominent problem, especially when the information is mainly contained in
internet-based documents. Many real cases show that existing search engines are
unsatisfactory at locating information about terms that have different meanings in
different domains, and people are easily confused when facing such varied sources
of knowledge.
One example of the issues discussed in this thesis is the following. A philosopher
wants to find the state of the art in the development of the ontology concept in
philosophy and types the term “ontology” into a search engine. When the “enter” key
is pressed, the philosopher finds that the results mainly concern Semantic Web
research, one of the most active interdisciplinary research topics of recent years.
Assume that our philosopher is skilful enough with search engines to guess the key
words that best fit the intended results. He inputs “ontology in philosophy” and
finally gets a partly philosophy-oriented result, which is still less philosophical
than he wants. Another example is the personal name search problem: nearly everyone
has tried to search for his or her name on the internet, and those who are not famous
are often bewildered by the results and surprised by the number of people who share
the same name.
Documents without semantic annotation lead to the problems mentioned above, and
Semantic Web technologies offer a solution by generating semantically annotated
documents. The terms in the documents have concepts and relationships, which are
described by ontologies, so that an ontology-based Semantic Web query can avoid the
drawbacks of key word-based search, which may misinterpret the goal of a search
task. Ontology is the essential component of such Semantic Web tooling. Today,
ontology is not only a philosophical term but is also widely used by computer
scientists, especially those from the knowledge engineering and Web communities,
which have merged to form the Semantic Web community.
Building an ontology usually includes knowledge elicitation as the first step, which
is also known as an important branch of knowledge acquisition. Traditional
knowledge elicitation is labour-intensive manual work and extremely
time-consuming, so more usable and convenient toolkits for building ontologies are
much needed. Protégé (Noy, Sintek, Decker, Crubezy, Fergerson, Musen, 2001) is
one of the most popular ontology editors (Lambrix, Habbouche, Pérez, 2002),
enabling users to create ontologies by defining the concepts, specifications,
relationships, annotations and other information of terms in a certain domain.
To do so, domain experts need to be able to visualise and manipulate their ideas
thoroughly and flexibly before they structure them in Protégé. Several standard
knowledge acquisition/elicitation techniques, such as repertory grids and laddering,
have been developed to help organise domain experts’ ideas into basic structures and
to recover tacit knowledge. Card sorting has been used for several decades and was
systematically formalised by Rugg and McGeorge during the 1990s; it is remarkably
useful for finding out how people categorise things (Rugg & McGeorge, 1997).
Laddering was first introduced by Hinkle (1965), a clinical psychologist, in order to
model people’s concepts and beliefs through an unambiguous and systematic
approach. In the field of market research, laddering plays an essential part in
evaluating people’s concepts of goals and values. Most of these knowledge
acquisition/elicitation techniques are visual or graphical. However, traditional card
sorting and laddering sessions are extremely difficult to manage and trace back: it
is nearly impossible to keep a record of hundreds of cards or pieces of paper and
return to a previous state without a complicated series of actions, such as video
recording, searching and playback. Computer-aided knowledge elicitation tools are
needed to automate those activities.
The goal of this thesis and its corresponding project is to develop a Protégé
plug-in to help people in this process. In addition, there is an opportunity for a
graphical direct-manipulation interface and user-centered design; that is, interviews
are required to collect information about how users behave while they are eliciting
knowledge.
1.2 Approach
This thesis introduces a straightforward framework for building a knowledge
elicitation tool as a Protégé plug-in. The approach combines classical methodologies
for manual knowledge elicitation with the support of a heuristic toolkit. Two
methods of knowledge elicitation are applied in this toolkit. The first is card
sorting. This step is supported by manually retrieving concepts of interest from a
set of domain texts selected by domain experts. The second, laddering, takes a
completed card sorting result as a basic vocabulary reference set, which the users
convert into a bottom-up representation by adopting pre-defined or user-defined
relationships. Alternatively, users may start their laddering process directly from
an existing pile of terms. Heuristic questions defined with each relationship appear
to help the user organise the hierarchical structure of the laddering session. The
results of these steps are assessed and assembled into a first version of the output
information, which is then accessible to this plug-in or to Protégé for future
development. The format of this output file is one of the design concerns, and its
pros and cons are also discussed. The output can eventually be extended and
converted into domain ontologies, possibly by an automatic ontology generator.
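The two-step workflow described above can be made concrete with a small data-model
sketch in Java. It is illustrative only: the names ElicitationSketch, CardSort,
Relationship and Ladder, the question template, the pizza terms, and the rule that a
ladder's child term must come from the card sorting vocabulary while its parent may
be a new term are all assumptions made for this example. They are not the classes or
the behaviour of the plug-in itself.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Illustrative sketch only: every name below is hypothetical and is not
// taken from the plug-in's actual source code.
final class ElicitationSketch {

    // Card sorting: each term (card) sits in exactly one named pile.
    static final class CardSort {
        private final Map<String, Set<String>> piles = new LinkedHashMap<>();

        void place(String pile, String term) {
            // Take the card out of any pile it is already in before re-filing it.
            for (Set<String> cards : piles.values()) {
                cards.remove(term);
            }
            piles.computeIfAbsent(pile, key -> new LinkedHashSet<>()).add(term);
        }

        // All sorted terms: the vocabulary handed to the laddering step.
        Set<String> vocabulary() {
            Set<String> all = new LinkedHashSet<>();
            for (Set<String> cards : piles.values()) {
                all.addAll(cards);
            }
            return all;
        }
    }

    // A relationship plus the heuristic question shown while laddering.
    static final class Relationship {
        final String name;
        final String questionTemplate; // "%s" stands for the term being laddered

        Relationship(String name, String questionTemplate) {
            this.name = name;
            this.questionTemplate = questionTemplate;
        }

        String questionFor(String term) {
            return String.format(questionTemplate, term);
        }
    }

    // Laddering: bottom-up links of the form child --relationship--> parent.
    static final class Ladder {
        private final Set<String> vocabulary;
        private final List<String[]> links = new ArrayList<>();

        Ladder(Set<String> vocabulary) {
            this.vocabulary = vocabulary;
        }

        void link(String child, Relationship relationship, String parent) {
            // Assumption: the child must be a sorted card; the parent may be new.
            if (!vocabulary.contains(child)) {
                throw new IllegalArgumentException("Unknown term: " + child);
            }
            links.add(new String[] {child, relationship.name, parent});
        }

        List<String> parentsOf(String child) {
            List<String> parents = new ArrayList<>();
            for (String[] link : links) {
                if (link[0].equals(child)) {
                    parents.add(link[2]);
                }
            }
            return parents;
        }
    }

    public static void main(String[] args) {
        CardSort sort = new CardSort();
        sort.place("Pizzas", "Margherita");
        sort.place("Toppings", "Mozzarella");

        Relationship isA = new Relationship("is-a", "What kind of thing is %s?");
        Ladder ladder = new Ladder(sort.vocabulary());

        System.out.println(isA.questionFor("Margherita"));
        ladder.link("Margherita", isA, "Pizza");
        System.out.println(ladder.parentsOf("Margherita"));
    }
}
```

Keeping the relationship and its question together in one object mirrors the idea
that each relationship carries its own heuristic prompt; a real tool would also need
to persist the piles and links in whatever output format is chosen.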
User-centered design methods were used to interview real users and collect the
traditional or intuitive ways in which they use card sorting and laddering, from
which the protocols between the users and the software system are established.
Furthermore, the user-defined relationships in the laddering tool are another
application of user-centered design techniques.
In collaboration with the CO-ODE project (Rector, 2002) of the Medical Informatics
group at the University of Manchester, the main focus of this thesis is to develop a
plug-in that provides a knowledge elicitation environment within a widely used
software system. The process of building such a plug-in has been applied to build
the card sorting and laddering plug-in for Protégé. Within this context, the
conversion of human concepts into a machine-readable output document and the
evaluation of the two knowledge elicitation tools constitute the central parts of
the scientific research work. The evaluation first gathers software testing feedback
from domain experts and other potential users; the feedback is then analysed in
comparison with traditional card sorting and laddering.
1.3 Thesis Outline
Chapters 2 to 4 provide the theoretical background for this thesis. Chapter 2
introduces traditional knowledge elicitation techniques, including their
limitations, and briefly investigates two recent applications of knowledge
elicitation techniques, the AKT project and the PCPACK toolkits. Chapter 3 gives an
overview of Semantic Web technology, comprising the theory of ontology engineering
approaches and their various representations. It also gives an overview of the
Protégé system and the CO-ODE project, which provide the broader context within
which the research work of this thesis is embedded. Chapter 4 describes the basic
concepts and principles of user-centered design and the design of user interviews
for collecting guidelines to frame the plug-in.
Chapter 5 applies the plug-in design process to the above-mentioned project to build
a plug-in for the popular software system Protégé. The procedure for converting a
human conceptual structure into machine-based output is discussed there in detail.
Chapter 6 presents the results and analysis of the testing and evaluation of the
plug-in, which was implemented using collaborative design methods, constituting
part of the framework. Finally, Chapter 7 summarises the findings of this project
and offers an overview of future work and complementary research.
2. Knowledge Elicitation
Knowledge is everywhere. It can be the text in books, web-based documents, the
sound on cassettes or the video recorded on DVD. Many people in the Knowledge
Management community are working on how to manage the various kinds of knowledge
effectively, efficiently and correctly, yet they are not always successful. Why?
Because knowledge is too varied to be handled easily; for example, it is quite
difficult to find a specific sentence in a book in the library, even when we know
the sentence should be there.
Things have changed with the growing popularity of the personal computer, because
its storage makes it easy for people to keep their books by converting them into
digital versions, although typing is usually labour-intensive work. We now have
numerous sources of text on the internet, from which we elicit knowledge to acquire
information in the areas we are interested or expert in.
Knowledge acquisition is a process which extracts knowledge from sources of
expertise and transfers it into a knowledge base. Knowledge elicitation is the most
important branch of knowledge acquisition, obtaining knowledge from a human
domain expert for use in a specific area (Cooke, 1994).
2.1 Overview
As knowledge elicitation is usually the first step in building ontologies, it is the
first thing to be investigated in this thesis.
2.1.1 History
From the mid-1980s, people began to research expert systems as a sub-discipline of
knowledge engineering, and this was also the starting point of scientific research
on knowledge elicitation. Expert systems are built to help people in problem
solving and decision making, and they hold specific knowledge of a certain domain
which is acquired from human domain experts. It became clear that expert problems
cannot be solved by a common strategy alone; much depends on knowledge of the
specific domain (Glaser & Chi, 1988).
Knowledge elicitation is not easy to carry out, because:
Experts are normally busy and difficult to find
Experts may have different points of view of knowledge
Uses of knowledge vary with the different backgrounds of experts
It is therefore important to make explicit the domain experts’ common-sense
assumptions, as well as the origins and explanations of the different pieces of
knowledge they hold. The knowledge in a certain domain includes:
Domain concepts
Specification of concepts
Relationship between concepts
Related activities about concepts
Thus, people began to develop knowledge elicitation techniques to obtain
knowledge effectively, efficiently and correctly. A number of these
methods are borrowed from cognitive science and other disciplines such as
anthropology, ethnography, and business administration (Boose & Gaines, 1988;
1990; Hoffman, 1987). In the meantime, other applications, including computer
interface design, agent systems and e-learning systems, began to use knowledge
elicitation methods to enhance the functionality of their software. In
Human-Computer Interaction and human factors design (e.g., Benysh, Koubek, &
Calvez, 1993), for example, knowledge elicitation techniques were used effectively
in the early 1990s, with the growing popularity of graphical personal computer
systems (Shadbolt & Burton, 1995).
Knowledge elicitation gained additional aspects through the influence of
Semantic Web technology on knowledge engineering in the early 2000s, which will be
discussed in Chapter 3 of this thesis.
2.1.2 Traditional Knowledge Elicitation Methods
Four categories of knowledge elicitation methods are identified and briefly
described below. Within each group there are a number of knowledge elicitation
methods and variations on individual methods (Cooke, 1994).
Observation
Knowledge elicitation usually starts with observations of tasks within the
domain of expertise. Observations can provide an overall impression of the
specific domain, help people to generate initial concepts of the domain, and
identify any issues to be dealt with during later phases of knowledge elicitation.
Observations can take place in the natural setting, thus providing preliminary
glimpses of actual behaviour that can be used for future tasks and as a resource
for later knowledge elicitation activities.
Interviews
The most straightforward way to find something out is to ask someone directly. This
is a kind of unstructured interview, the most frequently applied of all the
elicitation methods in the 1980s (Cullen & Bryman, 1988). Unlike the free and open
unstructured interview, structured interviews have pre-determined content or
order. The two kinds of interview have their own purposes and should
be used in different scenarios.
Process Tracing
Process tracing involves collecting sequential behavioural events and analysing
the resulting event protocols so that inferences can be made about the
underlying cognitive processes. These techniques are therefore usually used
to elicit session-based information, for instance the conditional rules used in
making decisions, or the order in which various cues are attended to.
Conceptual Methods
Conceptual methods elicit the conceptual structure of domain-specific
knowledge and its inter-connections. Several stages are normally required,
each associated with a variety of approaches. The stages are:
a) The elicitation of classes or concepts through interviews or analysis of
documentation.
b) The collection of empirical relationships between concepts from experts.
c) The elimination of redundant concepts.
d) The interpretation of the output conceptualisation.
The four groups of knowledge elicitation methods cover the major existing
techniques. However, new approaches are continuously being developed
for specific uses or other purposes. In a nutshell, the traditional knowledge
elicitation methods are the origin of more specific techniques such as laddering
and card sorting, which are addressed in detail in the following sections.
2.2 Card Sorting
Card sorting is a comprehensive knowledge elicitation technique that is now used in
several disciplines, such as knowledge engineering, psychology and marketing. In
the field of knowledge elicitation, card sorting is considered to be one of the most
effective ways of eliciting domain experts’ ideas about the knowledge structure.
The traditional card sorting method generally uses a pile of credit-card-sized cards
created by the researchers, who write or print the domain concepts on them. The
domain experts or other users sort the cards into piles or groups and describe their
reasons and criteria for sorting. Video recording both the actions and the voices
throughout the entire procedure is the best approach for future analysis because it
is the most convenient way to trace back, although preparing the equipment is
somewhat complicated.
Much evidence shows that card sorting has many positive aspects for running a
useful and reasonable elicitation experiment: it helps respondents to recall domain
concepts; identifies problems at different levels; reveals feedback from different
groups of people; provides a structured, tree-like pile of concepts for further
processing, such as laddering; and is quick to perform and easy to handle (Rugg,
Corbridge, Major, Shadbolt & Burton, 1992; Rugg & McGeorge, 1997; Nurmuliani,
Zowghi & Williams, 2004).
The figure below shows a snapshot of a typical card sorting session performed
manually, which gives a general idea of how people sort the cards.
Figure 2.1: The Traditional card sorting (Nielsen, 1995)
Nevertheless, the drawbacks of manual card sorting are also clear:
Easily disturbed
If a gust of wind blows the window open while we are sorting, the order of the
cards will be disturbed, especially if the piles consist of hundreds of cards.
Even if the card sorting is performed in a draught-free office, a spilled cup of
coffee might cause the same result.
Difficult to manage
If we record the whole procedure on video tape, it is difficult to find the right
tape without watching its contents, because the information on the side label of
the tape is usually not sufficient. When researchers want to find a specific task
performed on an unknown date, they have to search a tape from start to end and
try another one if the current tape is not the one wanted.
Transfer bottleneck
Video is much more difficult to handle. It is time-consuming to transfer the
video from tapes to a PC and compress it into a small file, or to exchange the
large video file via the internet. Duplicating the tape and sending it as a postal
parcel is possible, but an extra tape duplicator is required, and unexpected
damage that makes the tape unreadable may occur during delivery.
In this thesis, a computer-based card sorting tool is introduced to solve the
problems cited above. With computers and back-up copies of documents, people no
longer need to fear physical damage caused by the environment. Transaction-based
management makes task handling straightforward, and the documents are
comparatively small, syntax-based text files that are easy to deliver via the
internet. A detailed introduction, design, implementation and evaluation of this
software system will be given in Chapters 5 and 6.
2.3 Laddering
Laddering was first introduced by Hinkle (1965) as an approach to eliciting a
person’s conceptualised and structured ideas in a straightforward and systematic
manner. Building on Personal Construct Theory (Kelly, 1955), Hinkle mainly
focused on clinical psychology. Laddering was subsequently developed for use in
market research, to discover the goals and values that lead customers to choose
between different brands of products (e.g. Reynolds & Gutman, 1988; Wansink,
2003).
More recently, with the increasing popularity of knowledge engineering and expert
system research, laddering has been widely used in the field of knowledge
elicitation, while the purpose of eliciting people’s goals and values has remained
the same (e.g. Corbridge, Rugg, Major, Shadbolt & Burton, 1994; Rugg, Eva,
Mahmood, Rehman, Andrews & Davies, 2002). The knowledge elicitation community has
developed a well-established range of formal semantics, procedures and notations
for building ladders. However, requirements engineering in this field has
different and broader theoretical foundations than clinical psychology and
market research (Rugg & McGeorge, 2002).
Ontology development normally starts with a knowledge elicitation phase, so
laddering techniques play an important role in discovering the potential
relationships between the domain concepts. The laddering method is usually used
in combination with other knowledge elicitation methods such as card sorting. The
subjects and objects within the ontology are inter-connected by several kinds of
relationships elicited from the domain experts via laddering, and the structural
source of the subjects and objects is built via card sorting. As an ontology is a
structured domain knowledge base obtained from experts, laddering is essential
when developing ontologies.
Based on Rugg and McGeorge’s (1995; 2002) categorisation, laddering can be used
for three major purposes.
Laddering to elicit sub-classes
Taking fruit concepts as an example, when the system prompts the users to give
some sub-classes of “fruit”, the users might say “apple” and “orange”. The
questioning continues, and the user gives “granny smith” and “royal gala” as
sub-classes of “apple”. The questioning system interactively prompts the users
and helps them to build the ladders.
Laddering to elicit explanations
If the aim is to elicit explanations from the users, or interviewees, the
questioning system asks, for example, “Which one do you prefer, apple or meat?”,
and then records the users’ feedback as the explanation of their choices.
Laddering to elicit goals and values
At this stage the system has the fundamental structure and explanations of the
concepts, so it is possible to discover the goals and values of each user. If a
user chooses the apple, or always prefers vegetables or fruit, he might be a
vegetarian; conversely, if he prefers meat, he is probably a meat lover.
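The first purpose, laddering to elicit sub-classes, can be pictured as building a
small tree from the answers. The following is only a minimal sketch under stated
assumptions: the class name Concept and its methods are hypothetical, chosen for
illustration, and are not taken from the plug-in itself; the answers of the
dialogue above are hard-coded rather than asked interactively.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Concept:
    """One rung of a ladder: a concept plus its elicited sub-classes.

    Hypothetical illustration type, not part of the plug-in described here.
    """

    name: str
    subclasses: list[Concept] = field(default_factory=list)

    def add_subclasses(self, *names: str) -> list[Concept]:
        """Record the answers to 'give some sub-classes for <name>'."""
        added = [Concept(n) for n in names]
        self.subclasses.extend(added)
        return added

    def render(self, depth: int = 0) -> list[str]:
        """Return the ladder as indented lines, one concept per line."""
        lines = ["  " * depth + self.name]
        for child in self.subclasses:
            lines.extend(child.render(depth + 1))
        return lines


# Replay of the fruit dialogue from the text.
fruit = Concept("fruit")
apple, orange = fruit.add_subclasses("apple", "orange")
apple.add_subclasses("granny smith", "royal gala")

ladder_lines = fruit.render()
print("\n".join(ladder_lines))
```

Each call to add_subclasses corresponds to one question of the questioning system;
an interactive tool would obtain the names from the user instead.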
2.4 AKT Project
The Advanced Knowledge Technologies (AKT) project is a collaboration between five
internationally recognised UK universities. It aims to develop and extend a range
of technologies providing integrated methods and services for the capture,
modelling, publishing, reuse and management of knowledge (Shadbolt, 2003). The AKT
project covers six major aspects of knowledge technologies, which are
interrelated through various research topics. The six aspects are:
Knowledge Acquisition
Knowledge Modelling
Knowledge Retrieval
Knowledge Reuse
Knowledge Publishing
Knowledge Maintenance
In this thesis, we are mainly concerned with the knowledge acquisition
technologies, because knowledge elicitation is a major sub-field of knowledge
acquisition. Of the many research projects with different emphases within the
knowledge acquisition strand of the AKT project, three are closely related to
this thesis and are introduced below.
The first is Adaptiva, a user-centered ontology building environment based on
using multiple strategies to construct an ontology, which minimises user input by
using adaptive information extraction (Brewster, Ciravegna & Wilks, 2002; 2003).
Potentially, this project could help in building a document elicitation tool by
enhancing its functionality.
Figure 2.2: Snapshot of the Adaptiva System
Secondly, the COHSE project researches methods to significantly improve the
quality, consistency and breadth of linking of WWW documents at retrieval and
authoring time (Carr, Bechhofer, Goble & Hall, 2001; Bechhofer, Goble, Carr &
Kampa, 2002). COHSE could be used to annotate web documents, helping users to
elicit concepts from the documents directly in popular web browsers (e.g.
Firefox, 2005) via a proxy server that pre-processes the web pages.
Figure 2.3: The COHSE Structure
The third is Amilcare (Ciravegna, 2001), an adaptive information extraction tool
designed to support document annotation for the Semantic Web. It provides a new
algorithm for automatically extracting knowledge from documents with
comparatively high accuracy and efficiency.
Figure 2.4: Amilcare System Snapshot
In all, the three projects provide new approaches to helping users elicit
knowledge from various kinds of documents, including plain text, hypertext and
other web documents, which are also the major information sources for the plug-in
introduced in this dissertation.
2.5 PCPACK Toolkits
PCPACK is a powerful toolkit for modelling, distributing, managing and reusing
knowledge within a business context. The toolkit has been available for over 10
years, and many global enterprises have used it and provided feedback for its
further development. In response to new technologies and market demands, for
instance the integration of the CommonKADS method (Schreiber, Akkermans,
Anjewierden, de Hoog, Shadbolt, van de Velde & Wielinga, 2000), several updated
versions of PCPACK have been released, and it is now one of the most popular
knowledge engineering toolkits in the world.
The main functions of PCPACK are capturing, structuring, validating and reusing
knowledge. They are implemented by several independent but connected tools,
including the Ladder Tool, Matrix Tool, Annotation Tool, Diagram Tool, Protocol
Tool, Publisher Tool and Diagram Template Tool (Milton, 2005).
Figure 2.5: The PCPACK System Structure
The Ladder Tool in PCPACK enables the user to build various hierarchies of
knowledge. The laddering tool in the knowledge elicitation plug-in introduced in
this thesis is not a simple transplant of the PCPACK Ladder Tool, although the
two share a theoretical background in the laddering method. The laddering tool in
this plug-in is developed purely on the basis of practical interviews with a
number of domain experts and potential users, taking an empirical approach to
performing the laddering method. Furthermore, combining the card sorting and
laddering tools is a reasonable design because many domain experts use the result
of card sorting while doing laddering. The experiments and interviews carried out
to design this plug-in, with their emphasis on user-centered design and user
participant design, are discussed in Chapter 4 and in the Appendix.
3. The Semantic Web
This chapter gives an overview of Semantic Web technology, the theory of ontology
engineering, a summary of the CO-ODE project, and the Protégé system as the
parent platform of this knowledge elicitation plug-in. In general, this chapter
provides the theoretical foundations of this thesis in the domain of the Semantic
Web.
3.1 The Semantic Web Overview
First introduced by Tim Berners-Lee (Berners-Lee, 1996), Semantic Web technology
has become one of the most active research topics of recent years. It combines
the strengths of the knowledge engineering community and the Web community to
discover new technologies for various research domains, such as Knowledge System
Design, Web Services, Grid Computing, e-Science, e-Commerce and so on. Tim
Berners-Lee and his colleagues have given a clear definition: “The
Semantic Web is an extension of the current web in which information is given
well-defined meaning, better enabling computers and people to work in
cooperation” (Berners-Lee, Hendler & Lassila, 2001).
Figure 3.1: The Classic Cake Diagram
Basically, Semantic Web technologies aim to link multiple sources of information
so that people can access it easily, and to make information machine readable.
The large amount of information on the Web makes it difficult for people to
locate what they are looking for, even though the information has been published
on the Web. Search engines can help with this task, but almost everyone has
experienced the difficulty of finding a topic among tens of thousands of search
results. XML technology (Bray, Paoli, Sperberg-McQueen & Maler, 2000) is employed
so that machines can directly process large-scale data sets and make them
reusable. Unfortunately, neither traditional HTML nor XML documents can provide
the semantically annotated information that is required to locate resources that
meet specific demands.
On the one hand, we need the extensibility of XML because it will be useful for
potential future extensions. On the other hand, we want to add semantic
information to HTML documents using XML syntax, so that the information is
unambiguously defined and identified on the World Wide Web and interconnected by
semantic relationships. In the Semantic Web, the source of a piece of information
is identified by a URI (Uniform Resource Identifier) so that it is unique and can
be linked (Berners-Lee, Fielding, Irvine & Masinter, 1998). A URI is a string
that identifies an abstract or physical resource on the Web in the familiar
URL-like format.
In order to define the relationships between resources on the Web, we need a new
syntax-based document format for exchanging machine-readable information on the
Web, one that provides machine-understandable statements so that machines are
interoperable. The triple structure is therefore introduced to implement the
Resource Description Framework (RDF), and we use RDF as the file name extension
(Lassila & Swick, 1999). A triple resembles a natural language sentence and
consists of a subject, a predicate and an object. An example of an RDF expression
is:
<http://yimin.wang.cn> <http://www.family.com/schema/isSonOf> <http://shuhua.chen.cn>
Here Yimin Wang, a person, is the subject; “isSonOf” is the predicate; and Shuhua
Chen is the object. An RDF statement like this enables people to locate the
resources directly using the URIs published on the WWW, and it can be reused by
others. As mentioned above, when RDF statements are encoded in XML syntax they
are machine readable and interchangeable. The remaining problem is that RDF
statements carry no semantics: when computers perform search tasks on the WWW,
they do not understand that Yimin Wang is a human being rather than a car or some
other thing, although a knowledgeable person would understand this. To make the
statements we have written understandable to machines, ontologies are applied to
explain the properties of the relationships.
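The triple structure can be illustrated with a few lines of code. This is a
minimal sketch, assuming nothing beyond the Python standard library: the names
Triple, to_ntriples and objects_of are hypothetical helpers introduced here, and
the one-line serialisation follows the N-Triples convention of angle-bracketed
URIs terminated by a full stop.

```python
from typing import NamedTuple


class Triple(NamedTuple):
    """A single RDF-style statement; every field holds a URI string."""

    subject: str
    predicate: str
    obj: str


def to_ntriples(triple: Triple) -> str:
    """Serialise one triple of URIs as an N-Triples style line."""
    return f"<{triple.subject}> <{triple.predicate}> <{triple.obj}> ."


def objects_of(triples: list[Triple], subject: str, predicate: str) -> list[str]:
    """Return every object linked to `subject` by `predicate`."""
    return [t.obj for t in triples
            if t.subject == subject and t.predicate == predicate]


# The statement from the text: Yimin Wang isSonOf Shuhua Chen.
IS_SON_OF = "http://www.family.com/schema/isSonOf"
statement = Triple("http://yimin.wang.cn", IS_SON_OF, "http://shuhua.chen.cn")
graph = [statement]

line = to_ntriples(statement)
parents = objects_of(graph, "http://yimin.wang.cn", IS_SON_OF)
print(line)
print(parents)
```

The lookup succeeds purely by string matching on URIs, which is the limitation
discussed above: nothing in the triple says that either resource is a person.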
3.2 Ontology and Ontology Engineering
The plug-in described in this thesis aims to provide the first step in building
ontologies: it elicits knowledge from domain experts, who then structure that
knowledge. This section introduces the development of ontology theories and
ontology engineering techniques.
3.2.1 Introduction
The term “ontology” is borrowed from philosophy and extended to the Semantic Web
field, where it denotes a knowledge base. In Semantic Web research, an ontology
is described as an explicit formal description of a conceptualisation (Gruber,
1993). People may therefore have quite different descriptions and explanations of
the same object, depending on their cultural backgrounds, level of knowledge,
conceptual models, cognitive methods and many other aspects. Ontologies are,
therefore, difficult to build.
There are many different ways of representing ontologies, including syntax-based
ontology languages and UML diagrams. Research groups have developed several
different ontology languages: RDFS (Brickley & Guha, 2002), DAML+OIL (Conolly,
Harmelen, Horrocks, McGuinness, Patel-Schneider & Stein, 2001), OWL
(Patel-Schneider, Horrocks & Harmelen, 2002) and KAON, developed by AIFB in
Karlsruhe. Of these, DAML+OIL and KAON are extensions of RDFS, while OWL extends
DAML+OIL. The UML diagram representation of ontologies is not widely employed
because of the limitations of diagrams, which are difficult for machines to
process and to draw at large scale, although they are easy for human beings to
understand.
By using ontologies, it is no longer difficult to describe semantically the
example given in the section introducing RDF, by defining the relationships
between the subjects and objects. After building an ontology that includes the
information from the RDF statement, the contents of the syntax-based document
are:
<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema#"
    xmlns:rdfs="http://www.w3.org/2000/01/rdf-schema#"
    xmlns:owl="http://www.w3.org/2002/07/owl#"
    xmlns="http://www.owl-ontologies.com/unnamed.owl#"
    xml:base="http://www.owl-ontologies.com/unnamed.owl">
  <owl:Ontology rdf:about=""/>
  <owl:Class rdf:ID="Shuhua_Chen"/>
  <owl:Class rdf:ID="Yimin_Wang">
    <rdfs:subClassOf rdf:resource="http://www.w3.org/2002/07/owl#Thing"/>
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty>
          <owl:ObjectProperty rdf:ID="isSonOf"/>
        </owl:onProperty>
        <owl:someValuesFrom rdf:resource="#Shuhua_Chen"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>
From the nested structure of this document, it is easy to see the syntax-level
relationships between XML, RDF, RDFS and OWL.
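The layering can be made concrete by reading such a document with a generic XML
parser and noting which vocabulary each element or attribute belongs to. The
sketch below is illustrative only and uses Python's standard xml.etree module
rather than an RDF or OWL library; the embedded document is a shortened variant
of the listing, with underscores in the identifiers (an assumption made here so
that the rdf:ID values are valid XML names), and the helper q is hypothetical.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
OWL = "http://www.w3.org/2002/07/owl#"

# Shortened variant of the listing; identifiers use underscores (assumption).
DOCUMENT = f"""<?xml version="1.0"?>
<rdf:RDF xmlns:rdf="{RDF}" xmlns:rdfs="{RDFS}" xmlns:owl="{OWL}">
  <owl:Class rdf:ID="Shuhua_Chen"/>
  <owl:Class rdf:ID="Yimin_Wang">
    <rdfs:subClassOf>
      <owl:Restriction>
        <owl:onProperty>
          <owl:ObjectProperty rdf:ID="isSonOf"/>
        </owl:onProperty>
        <owl:someValuesFrom rdf:resource="#Shuhua_Chen"/>
      </owl:Restriction>
    </rdfs:subClassOf>
  </owl:Class>
</rdf:RDF>
"""


def q(namespace: str, local_name: str) -> str:
    """Build the '{namespace}name' form that ElementTree uses for tags."""
    return f"{{{namespace}}}{local_name}"


root = ET.fromstring(DOCUMENT)          # XML layer: plain parsing
classes = root.findall(q(OWL, "Class"))  # OWL layer: class declarations
class_ids = [c.get(q(RDF, "ID")) for c in classes]  # RDF layer: identifiers

# RDFS layer (subClassOf) wrapping OWL restrictions on a property.
restrictions = []
for cls in classes:
    for sub in cls.findall(q(RDFS, "subClassOf")):
        for restriction in sub.findall(q(OWL, "Restriction")):
            prop = restriction.find(
                q(OWL, "onProperty") + "/" + q(OWL, "ObjectProperty"))
            filler = restriction.find(q(OWL, "someValuesFrom"))
            restrictions.append((cls.get(q(RDF, "ID")),
                                 prop.get(q(RDF, "ID")),
                                 filler.get(q(RDF, "resource"))))

print(class_ids)
print(restrictions)
```

A dedicated RDF or OWL API would interpret these elements as statements; the
point here is only that each layer reuses the syntax of the one beneath it.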
3.2.2 Ontology Engineering
How to build ontologies efficiently and reusably is one of the major concerns of
ontology engineering. As one of the most popular ontology editors, Protégé is
widely used by researchers to build ontologies, and their experience shows that
building ontologies manually is labour-intensive work. Protégé users need to
enter and edit the concepts in an ontology one by one, which includes typing the
names, editing the annotations, choosing the properties and defining the
restrictions. If there are thousands of concepts in one task, this work is
extremely time consuming. Researchers understandably do not want to spend their
time on such repetitive, non-innovative work.
In fact, not many general approaches have been proposed for building ontologies,
and few of them have been sufficiently shown to be domain independent. The
published methodologies are mainly general frameworks with abstract descriptions
and outlines, without detailed guidelines on how to build ontologies (Fernandez,
Gomez-Perez & Pazos Sierra, 1999). Many ontology engineering projects have
therefore been launched to find a proper way to build ontologies.
Ideally, the goal of ontology engineering is to enable machines to build a
certain proportion of ontologies, but even then human beings are required to
create ontologies by hand, so what they need is a more efficient approach. The
card sorting tool in this knowledge elicitation plug-in makes it possible to
create ontologies graphically, by first initialising the ontology structure and
the basic relationships between concepts. People can then save the raw ontology
and load it into a broader ontology editor such as Protégé for further
development. This tool relieves people of the most labour-intensive work in the
process of ontology engineering: modelling the basic structure of ontologies.
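As a rough illustration of how sorted piles could initialise an ontology
structure, the sketch below turns named piles of cards into subclass pairs. It is
a sketch under stated assumptions, not the plug-in's implementation: the function
name piles_to_subclass_axioms, the example piles and the choice of "Thing" as the
root are all introduced here for illustration.

```python
def piles_to_subclass_axioms(
    piles: dict[str, list[str]], root: str = "Thing"
) -> list[tuple[str, str]]:
    """Turn named piles of cards into (subclass, superclass) pairs.

    Each pile name becomes a class directly under `root`, and each card in
    the pile becomes a subclass of that pile's class.
    """
    axioms = []
    for pile_name, cards in piles.items():
        axioms.append((pile_name, root))
        for card in cards:
            axioms.append((card, pile_name))
    return axioms


# Hypothetical result of a card sorting session.
piles = {"Fruit": ["Apple", "Orange"], "Meat": ["Beef"]}
axioms = piles_to_subclass_axioms(piles)
for subclass, superclass in axioms:
    print(f"{subclass} subClassOf {superclass}")
```

A raw hierarchy of this kind is what would then be saved and refined in a full
ontology editor, where properties and restrictions are added.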
3.3 The CO-ODE Project
The Collaborative Open Ontology Development Environment (CO-ODE) is a two-year
project focused on developing tools for ontology building (Rector, 2002). The
knowledge elicitation plug-in described in this thesis is part of this project.
The aims of the project are, in short, to provide an enhanced ontology
development and knowledge acquisition environment for domain experts and to
integrate other research outcomes, e.g. those of the AKT project, into existing
popular tools such as Protégé, using user participant design techniques. As part
of the CO-ODE project, this knowledge elicitation plug-in shares its design
principles, which include giving user cooperation an essential position while
developing tools for existing toolkits.
The research outcome of the CO-ODE project is a range of Protégé plug-ins,
including (Drummond, 2005):
OWLViz - A plug-in that gives a graphical, structural view of the ontology.
Figure 3.2: OWLViz Snapshot
Protégé Wizards - A plug-in with several basic wizards that automate part of the
class creation process.
Figure 3.3: Protégé Wizards Snapshot
OWLDoc - A plug-in that generates JavaDoc-style HTML documentation for an OWL
ontology, which gives a straightforward view of the ontology structure.
Figure 3.4: OWLDoc Snapshot
The Manchester Pizza Finder is an application with a user-friendly interface that
uses a pizza ontology and the RACER inference system to query the valid pizza
types. It is a good starting point for beginners who do not know much about how
the classifier and the ontology work together to implement semantic queries.
Figure 3.5: The Manchester Pizza Finder Snapshot
These are only some of the plug-ins produced by the CO-ODE project, which places
more emphasis on ontology engineering from the viewpoint of knowledge elicitation
techniques. The CO-ODE project now has a one-year extension and is expected to
publish more applications and refinements, including the plug-in introduced in
this thesis.
3.4 Protégé
Protégé is an ontology editor and knowledge acquisition tool developed mainly by
the Medical Informatics group at Stanford University. Protégé is also a community
effort: research groups around the world have contributed over 70 plug-ins,
including the Medical Informatics group at the University of Manchester, where
the tools in this thesis are being developed.
Protégé allows users to create ontologies and to edit the data entry forms used
for data input. The Graphical User Interface (GUI) of Protégé is well designed
and is being improved with each new release. The following screenshot shows the
statement “Yimin Wang is the son of Shuhua Chen” being edited and saved into an
ontology in Protégé 2000, 3.1 Beta. This example shows that the layout, working
procedure and outcome of the Protégé user interface are straightforward and easy
to follow.
Figure 3.6: An Example of OWL Syntax
Protégé is highly extensible, so researchers can develop their own tools to
extend its functionality and then integrate them into Protégé easily and
seamlessly. For the sake of productivity and compatibility with the existing
system, Protégé uses the Java (Sun Microsystems, Inc., 2005) platform as a
unified development environment, so that developers on different operating
systems can build and test their code smoothly. Plug-ins for Protégé are
developed using the Protégé 2000 application programming interface (API) (Musen,
Fergerson, Grosso, Noy, Crubezy & Gennari, 2000).
To check the satisfiability of ontologies, it is crucial to be able to connect to
reasoners from the Protégé user interface. Protégé 2000 supports reasoners such
as FaCT (Horrocks, 1998) and RACER (Haarslev & Möller, 2001).
Protégé is now a well-established toolkit and is expected to be developed further
in collaboration within the Protégé community.
4. User-centered Design
This chapter aims to derive a set of principles for designing the plug-in, with a
strong emphasis on user interface design. We plan to employ user participant
design techniques to achieve this demanding goal. After a brief introduction to
user-centered design methods and principles, the guidelines for designing this
plug-in are set out. We then discuss the motivations, procedures and results of
the user interviews, which are expected to play an important role in future
development.
4.1 Introduction to User-centered Design
In Human-Computer Interaction (HCI) research, user-centered design, also known as
usability engineering, is one of the most essential methodologies and is now
widely used in various disciplines, including Software Engineering, Knowledge
Management, Information Systems and so on (Norman & Draper, 1986; Shneiderman,
1998). Moreover, one of the aims of the CO-ODE project is to provide a
user-oriented toolset (Rector, 2002), so user-centered design techniques are kept
in mind throughout the entire plug-in design life cycle.
The importance of usability engineering has been stated repeatedly, through
principles and case studies, by different research agencies and groups. The NASA
usability engineering team (2002) listed “10 Great Reasons to do Usability”. In
summary, they argue that usability work makes developers look smart and
professional and users more productive and happy; that it saves development time,
money, maintenance effort and support resources; and, finally, that it lets you
sleep better. It is an informal list, and probably contains some redundancy. A
famous case is the IBM website (Tedeschi, 1999), where the most frequently used
function was “search”, because users could not locate what they wanted by
navigating the site. Second place belonged to the “help” link: evidently people
looked for help after they had failed to find the information by searching. After
a ten-week project to redesign the IBM website, which cost over a million
dollars, use of the help link decreased by 84% and web-based sales increased
fourfold.
User-centered design techniques include a range of procedures, guidelines and
software tools that help researchers and developers to decide system design
matters. Their importance lies in helping developers ensure that the relevant
design activities are carried out in a user-oriented manner (Rauterberg, 2003).
There are three main categories of principles that support user-centered design:
learnability, flexibility and robustness (Dix, Finlay, Abowd & Beale, 2004).
Learnability concerns how easily new users can begin to use the system.
Flexibility concerns the variety of ways in which the user and the system
exchange information. Robustness concerns providing a steady, dependable and
fault-tolerant running environment. All of these principles call for a high
standard of design procedure and the use of various design techniques.
User participant design involves users in the software design process, by
interviewing various groups of users selected by criteria such as age,
occupation, gender, culture and so on. The interview results are gathered and
analysed in order to discover the goals and values of the target user group. User
participant design techniques are indispensable in designing this knowledge
elicitation plug-in, because the target user group consists mainly of scientific
researchers with different disciplines, requirements, personal preferences, and
ways of working and thinking.
User participant design includes a range of manual activities, such as using
paper as window frames; cutting paper into rectangles of different sizes to serve
as dialogues and menus; choosing different colours for different selection
feedback; drawing and dragging where necessary to modify the interface; and
taking pictures while performing the activities. All of these are performed by
real users. The photo below was taken by the author of this thesis during the
CS617 Interactive System Design module at the University of Manchester, taught by
Dr. Mark van Harmelen.
Figure 4.1: An Example of User Participant Design Activities
4.2 Plug-in Design Principles
When developing a plug-in for an existing system, including the design of its
structure and user interface, it is essential to set out a list of design
principles. Given its requirements and features as a knowledge elicitation tool
and a Protégé plug-in, this plug-in should:
have a simple and tidy interface
This plug-in is user-oriented software rather than a program running in the
background, so a well-designed and intuitive user interface is essential,
allowing users to become familiar with the software quickly and easily.
have flexible operating options
According to Dix and his colleagues (2004), flexibility is one of the
important categories of guidance for usability engineering and is
considered a core design issue. Providing a variety of operating options
lets users adapt easily to the workflow of this software and find their
preferred way of performing tasks.
make users participate in the design procedure
Both card sorting and laddering are traditionally performed manually, so it
is sensible to hold interviews to find out how people sort cards and build
ladders. This will be a substantial source of experience when reproducing
those activities on a computer. This knowledge elicitation plug-in aims to
elicit knowledge across different disciplines, environments and use cases,
so the interviewees need to be diverse in fields of expertise, cultural
backgrounds and working habits.
use existing APIs (Application Programming Interfaces)
Java is highly extensible, and many existing APIs can be used directly by
developers to speed up development. The standard Java API does not include
methods for processing RDF and OWL syntax-based documents, but other APIs,
such as the Jena API (McBride, 2002) and the Protégé OWL Plug-in API
(Knublauch, Musen & Rector, 2004), provide many options for handling web
documents in RDF and OWL syntax. When users need to write the runtime state
to an RDF or OWL file, the program can use those APIs to complete the task.
work compatibly with Protégé
As a plug-in for Protégé, this software should certainly work well with the
Protégé system. For compatibility with other important plug-ins, such as
the Protégé OWL plug-in, it would be ideal to share some tab widgets with
them, although this requires collaboration between developers from
different groups. Such sharing would enhance the overall performance of
this knowledge elicitation plug-in.
be extensible for future development
Besides card sorting and laddering, there are many other methods in the
field of knowledge elicitation, so this plug-in should be extensible enough
to have other tools integrated, such as a repertory grid tool, diagram
tool, matrix tool and the other tools mentioned in Chapter 2. Thanks to
Java’s extensibility, it is easy to integrate other independent Java
programs into an existing system. All that is needed is to design a
suitable interface layout that makes room for the extra tools.
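The last principle, extensibility, can be pictured as a registry in which each
elicitation tool occupies one tab of the plug-in. The sketch below is purely
illustrative: it is written in Python rather than Java, does not use the Protégé
API, and every name in it (ElicitationPlugin, register, open_tab and the tab
titles) is hypothetical.

```python
from typing import Callable


class ElicitationPlugin:
    """Hypothetical container that gives each registered tool its own tab."""

    def __init__(self) -> None:
        # Maps a tab title to a factory that builds the tool's view.
        self._tools: dict[str, Callable[[], str]] = {}

    def register(self, tab_title: str, factory: Callable[[], str]) -> None:
        """Add a tool; titles must be unique so that tabs stay unambiguous."""
        if tab_title in self._tools:
            raise ValueError(f"tab already registered: {tab_title}")
        self._tools[tab_title] = factory

    def tab_titles(self) -> list[str]:
        """Return the tab titles in registration order."""
        return list(self._tools)

    def open_tab(self, tab_title: str) -> str:
        """Build the view of the selected tool (a placeholder string here)."""
        return self._tools[tab_title]()


plugin = ElicitationPlugin()
plugin.register("Card Sorting", lambda: "card sorting view")
plugin.register("Laddering", lambda: "laddering view")
# A later extension only needs one more registration.
plugin.register("Repertory Grid", lambda: "repertory grid view")

titles = plugin.tab_titles()
print(titles)
print(plugin.open_tab("Laddering"))
```

Under this assumption, adding a new elicitation method does not change the
existing tools; only the layout has to accommodate one more tab.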
4.3 Interview for User Participant Design
It is necessary to plan a series of interviews and invite potential users to
participate in them in order to collect design clues. Different interview
methods, such as unstructured and structured interviews, should be employed for
different purposes.
4.3.1 Unstructured Interview Design
The unstructured interview tends to be used in the early stages of an interview
session, when the users are asked some general, unprepared questions.
The unstructured interview does not require any prepared questions, so its design
relies heavily on the interviewer’s personal communication and facilitation
skills. It is therefore important that the interviewer tries to focus on topics
related to the users’ general impression of this plug-in and encourages the
interviewees to give the key points of their thoughts.
In this project, we first need to know the users’ general ideas and points of
view about both the card sorting and laddering tools, their attitudes towards the
prospects of this plug-in, and possibly their personal habits in using computer
software. The results of the unstructured interviews provide the developer with
appropriate concepts and ways of thinking, rather than the technical details of
the software.
4.3.2 Structured Interview Design
By comparison, it is much easier for the interviewer to hold an interview with a
set of predefined questions. The structured interview design is more important
for the software designer because all the interviewees are asked the same set of
questions about the technical details of the software. The analysis of the
structured interviews is crucial, since the detailed technical issues in the
software design phase are settled mainly on the basis of its results. In this
project, the questions listed below were defined and asked.
a) Did you know about the card sorting method before? (If not, the interviewer
will give users some background knowledge about card sorting.)
This question aims to give the interviewee a general idea of the card sorting
method if he has no previous experience of it. In fact, it is best to have a
number of knowledge engineering research experts take this interview.
b) What do you think an automated card sorting method will be like?
This is one of the core questions, designed to discover the user’s first
impression of the computer-based card sorting method.
c) Here are some cards with different concepts. Could you please sort them in
your own way?
Users’ habits differ, and their intuitive actions determine how they use
software. It is essential to make the card sorting tool fit the habits of
the majority.
d) Could you please sort them again, by groups?
When we want to build ontologies, the major job of the card sorting method is
to group the cards into different piles and name each pile with a new
concept. Knowing how users perform this task provides a guideline for
implementing this functionality.
e) If the cards are put in one limited area of the desk, where will you put the
sorted piles?
This question relates to the layout of the user interface, that is, the
arrangement of the positions of its components.
f) Which card colour is best for you?
Basically, the colour of the cards should not be very bright or very dark,
and it is reasonable to standardise the colour scheme based on the different
users’ preferences and cognitive methods.
g) Did you know about laddering before? (If not, the interviewer will give the
users some background knowledge about the laddering technique.) And what do you
think the laddering method will be like?
These two questions are asked for the same reasons as the corresponding card
sorting questions.
h) Some of the cards are now sorted but some are not. How will you relate the
cards using different relationships?
This question is asked to capture how users add existing card items to the
ladder.
i) Do you want to see the card sorting result and laddering result simultaneously
on the desk?
There is a trade-off between the simple user interface principle and the
flexible ease of use principle, so we let the user decide.
j) What do you think about the output?
This question captures the users’ expectations of the software output.
After the interviews with a group of people were completed, the results were
collected and processed. The details of the interview results are presented in
Appendix C.
4.4 Analysis and Conclusion
From the interview results listed in Appendix C, it is not difficult to identify some
valuable points. We are concerned with the users’ general views of this
knowledge elicitation tool, acquired from the unstructured interviews, and with
several detailed technical aspects raised in the structured interviews. We therefore
analyse the results of the unstructured and structured interviews in turn.
4.4.1 Unstructured Interview
From the unstructured interviews, we can conclude that the majority of the
interviewees share the views listed below.
For the whole software system, the system should:
have a straightforward and simple user interface
be easy to use
give multiple options to users
have a unified output
make tasks easy to manage
For the card sorting tool, the tool should:
have a range of cards shaped like credit cards
make people understand the sorting mechanism
For the laddering tool, the tool should:
be shaped like a real ladder
explain to users what the ladder is
It is impractical to collect detailed software requirements from users at this stage,
and the results of the unstructured interviews only provide the software developer
with some general concepts and guidelines. However, these original, raw ideas
from real users should not be neglected.
4.4.2 Structured Interview
The structured interview results can be summarised by going through the questions
listed in section 4.3.2.
a) Have you come across the card sorting method before?
From the results, we find that most people who have not been involved in the
knowledge management research domain know almost nothing about card
sorting.
b) What do you think an automated card sorting method will be like?
Given some background on the card sorting method, non-experts tend to
imagine the cards as a pile of paper, metal or plastic pieces the size of playing
cards, credit cards or business cards, that is, a rounded rectangle about five
centimetres wide and eight centimetres long. They then often choose to sort the
cards as if playing poker. The experts in the knowledge management domain, on
the other hand, like to see pieces of paper with concepts written on them, sort
them into different piles, and then name the piles.
c) Here are some cards with different concepts. Could you please sort them in
your own way?
The knowledge management experts lack imagination at this point because
their thinking is confined to the formal routine of the card sorting method.
Their own ways of sorting cards are disorderly and unsystematic, and
sometimes they are confused about what they are doing. Non-experts, by
contrast, tend to share the common-sense view that cards should first be sorted
into piles whose cards share some similarity; after that they may consider
sorting the cards by different relationships.
d) Could you please sort them again by groups?
Our experts are very willing to do this and can complete the job quickly and
correctly, while the non-experts are not proficient at it. However, the main
concern of this question is to record how respondents go about sorting the cards
and the habits they are likely to have, rather than the sorting results.
All respondents either drag and drop the cards into piles or pick the cards up
and put them into groups. The two actions occur about equally often.
e) If the cards are placed in one limited area of the desk, where will you put the
sorted piles?
Concerning the layout of this software, most people like to put the sorted cards
above or to the left of the original area where the cards were located.
f) Which card colour suits you best?
This question turned out not to be well designed, because the interviewees
tend to choose their favourite colours, which are almost all different.
g) Have you come across laddering before?
This question has the same result as the corresponding card sorting question,
so the two could be merged into “Have you come across card sorting and
laddering before?”
h) What do you think the laddering method will be like?
People in the knowledge management community have a similar
understanding of the laddering technique, while people in other areas
intuitively think of a real ladder, which only goes up and down rather than
sideways.
i) Some of the cards are now sorted and some are not. How would you relate the
cards with different relationships?
Most people like to write the relationships on the back of the cards, whereas a
few people use extra cards of different colours to indicate different
relationships.
j) Do you want to see the card sorting result and the laddering result simultaneously
on the desk?
Unexpectedly, everyone gave the same answer to this question: yes. This may
be because people prefer to see the source and the destination at the same time
while laddering.
k) What do you expect of the output?
The experts want the output to be well formatted and reusable by other
programs, but the non-experts have little idea about this.
4.5 Conclusion
The results show a remarkable difference between people with different academic
backgrounds; however, they also suggest that age, gender and cultural background
do not play essential roles in the interviews. This is probably because a
statistical analysis would require a much larger sample than this series of
interviews can provide, given the limited size and budget of an M.Sc. project.
Fortunately, we have collected enough of the information required for the design
of this knowledge elicitation plug-in.
In the next chapter, this thesis gives a detailed technical design of the plug-in
based on the interview results.
5. Implementation of Knowledge Elicitation Plug-in
As pointed out in Chapter 1, people who elicit knowledge from various sources
of information often struggle with labour-intensive knowledge elicitation
techniques, which are fragile and difficult to manage. There is therefore an
opportunity to automate knowledge elicitation techniques such as card sorting and
laddering, so that people are released from traditional, time-consuming tasks by
performing the elicitation sessions on a machine.
This chapter describes the design and implementation of a tool for knowledge
elicitation. It includes the scope, requirements, design issues, structure and
implementation of the software system. The section “Plug-in for Protégé”
describes the technical factors in embedding this software into the Protégé
platform and other related issues.
5.1 Scope and Requirements
This knowledge elicitation software system aims to reduce the workload of
knowledge engineers and domain experts; increase the reusability of laddering and
card sorting processes; manage the knowledge elicitation tasks effectively; and
integrate seamlessly with an existing software system.
As mentioned in Chapter 2, experts are usually busy and their time is valuable, yet
knowledge elicitation tasks involve many kinds of interviews and take a large
amount of time to perform. System developers therefore face a trade-off between
the quality of the interview process and the cost of inviting domain experts. The
traditional card sorting method requires many paper cards, often hundreds.
Sorting hundreds of cards may take a couple of hours, and if the window is open
and a gust of wind blows in, everything can be ruined in a second. Wind is not the
only hazard: a cup of water or even an inexperienced intern could do the same. In
short, the traditional means of knowledge elicitation are laborious and fragile.
If we move the task onto a computer, we no longer need to worry about the issues
above. The task can be saved to permanent storage with as many back-up copies
as desired, and moving hundreds of cards on a computer is obviously much easier
than doing so on paper.
The second problem with manual knowledge elicitation methods is that they are
extremely difficult to reuse and to trace back. While people perform laddering on
the result of card sorting, to find their real goals and true values, they have to
record the results of card sorting as well as their thinking at the time. When they
finish, however, it is not possible to preserve the structure of the card sort by
leaving the cards on the table permanently. The cards have to be collected, and it
immediately becomes a problem to trace back to the previous activities. We could
record the session on paper, but writing everything down in a reasonable form,
including the structure, ways of thinking, comments and annotations, becomes
another heavy task. Video recording has already been rejected in previous
chapters, so we have to find a more beneficial approach.
The undo and redo mechanisms of text editors give us a hint for solving this
problem by automating the task on a computer. The whole procedure and all its
related matters, which we call a transaction, are stored temporarily in memory and
saved to permanent storage if necessary. By doing this, domain experts and
developers can go back to any earlier point they want; all they need to do is
store the transaction (one step of the card sorting process) whenever they reach
a milestone.
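The milestone idea can be illustrated with a minimal Java sketch. It is not the plug-in's actual code: the class name TransactionHistory, its methods and the representation of a card sort as a card-to-pile map are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch: names and data layout are assumptions, not the plug-in's API.
class TransactionHistory {
    // Each milestone is an independent copy of the card-to-pile assignment.
    private final List<Map<String, String>> milestones = new ArrayList<>();

    /** Stores a copy of the current state and returns the milestone index. */
    int saveMilestone(Map<String, String> cardToPile) {
        milestones.add(new LinkedHashMap<>(cardToPile));
        return milestones.size() - 1;
    }

    /** Returns a fresh copy of the state stored at the given milestone. */
    Map<String, String> restore(int index) {
        return new LinkedHashMap<>(milestones.get(index));
    }

    /** Number of milestones stored so far. */
    int size() {
        return milestones.size();
    }
}
```

Copying the map on both save and restore keeps later edits from silently changing an earlier milestone, which is the property that makes tracing back reliable.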
Normally, people organise card sorting and laddering results by documenting them
or filing them in folders (both physical folders and virtual folders on a computer).
It is time-consuming to locate the exact folder, and doing so requires a separate
mechanism such as a file manager, for example Windows Explorer. A built-in
ladder and transaction manager therefore needs to be developed for this
knowledge elicitation software system, following the principles of usability
engineering. Software users can easily manage their transactions and ladders
graphically through a well-designed Graphical User Interface (GUI), and they can
add, select, edit or delete a transaction or ladder.
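The four manager operations named above (add, select, edit, delete) can be sketched without any GUI as follows. The class TransactionManagerSketch and its method names are hypothetical, and a transaction is reduced to a name with a free-text description purely for illustration.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical sketch of a transaction/ladder manager; not the plug-in's own class.
class TransactionManagerSketch {
    private final Map<String, String> descriptions = new LinkedHashMap<>();
    private String selected; // null while nothing is selected

    /** Adds a new entry; returns false if the name is already taken. */
    boolean add(String name, String description) {
        if (descriptions.containsKey(name)) {
            return false;
        }
        descriptions.put(name, description);
        return true;
    }

    /** Marks an existing entry as the current one. */
    boolean select(String name) {
        if (!descriptions.containsKey(name)) {
            return false;
        }
        selected = name;
        return true;
    }

    /** Replaces the description of an existing entry. */
    boolean edit(String name, String description) {
        if (!descriptions.containsKey(name)) {
            return false;
        }
        descriptions.put(name, description);
        return true;
    }

    /** Deletes an entry and clears the selection if it pointed at that entry. */
    boolean delete(String name) {
        if (!descriptions.containsKey(name)) {
            return false;
        }
        descriptions.remove(name);
        if (name.equals(selected)) {
            selected = null;
        }
        return true;
    }

    String describe(String name) {
        return descriptions.get(name);
    }

    String getSelected() {
        return selected;
    }
}
```

In a real interface each method would be called from a button or menu listener; keeping the bookkeeping separate from the GUI makes it easy to test on its own.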
Fourth, as part of the ongoing CO-ODE project, it is essential to build the software
system within an existing, popular ontology editor and knowledge acquisition
system, Protégé. People are more familiar with a widely used toolkit than with a
brand new software system, so they will spend much less time learning to use the
new system. Since the Protégé toolkit has already been introduced in Chapter 3,
we can pay more attention to the technical problems of building the plug-in.
5.2 Designing Issues and Structure
According to modern software engineering practice and design patterns (Gamma,
Helm, Johnson & Vlissides 1995), the life cycle of this project should include
requirements analysis, system design, implementation, testing and evaluation.
This section does not cover the implementation, testing and evaluation phases; it
focuses on the design issues and provides an overall structure of the system. The
implementation is introduced in section 5.4, and Chapter 6 presents detailed user
testing and evaluation.
5.2.1 Case Study of Interviews
From the analysis of the interview results, some detailed design basics and
principles can be derived, which also build on the design principles listed in
Chapter 4 and the theoretical background in Chapters 2 and 3.
This software system should have 1) input from documents and a term elicitation
function in the user interface; 2) a series of cards generated from the terms, drawn
as rounded rectangles in the colour style of Protégé; because the users could not
agree on a colour, adopting the colour style of an existing popular base system
such as Protégé is a sensible choice; 3) a flexible, simple and straightforward user
interface whose layout places the working panel, containing both the card sorting
and laddering tools, on the left as tabbed widgets and the operation results on the
right, together with a number of sensibly arranged buttons; 4) a well-formatted
output.
In order to implement these design principles, a requirements analysis will be
employed, using requirements engineering techniques (Sommerville & Sawyer,
1997).
5.2.2 Requirement Analysis
Requirements engineering is an important branch of software engineering.
Software developers identify the needs or requirements of users and find possible
solutions to the proposed problems. The scope and aims of this project have been
defined in the first section of this chapter; we now first list all possible functions
and concerns of this knowledge elicitation tool. Brainstorming is performed at this
stage, and feasibility, cost, time and other practical matters are not yet considered.
Transaction related functions
record detailed user information
save and track transactions freely
transaction and ladder manager
User interface related functions
put a straightforward GUI for arranging cards
enable multi-tab allocation for large set of cards
user-friendly GUI
plug-in of Protégé
Input related functions
get terms from multi-format documents
get terms from formatted documents
automatically processing the terms
speech/scanning input
Card sorting functions
generate cards
edit cards with multi-source information
enable user defined cards
sort cards by user
ontology-based automatic card sorting
Laddering and output related functions
formatted file output
laddering by user
user defined laddering relationship
The second step is to define categories for the tasks listed above. The tasks are
divided into four categories, namely tasks that MUST, SHOULD, MAY or CANNOT
be implemented, depending on feasibility, cost in time and money, technical
difficulty, and so on. The functionalities are categorised by analysing the users’
preferences and the developer’s implementation priorities.
Each task is listed below with its classification and a short analysis.
MUST tasks:
get terms from formatted document
The input must be a set of terms to be sorted, and a formatted file is best suited to
machine processing.
record detailed user information
Because domain experts differ in personality and cultural background, they may
perform card sorting and laddering in dissimilar ways. It is therefore important to
record detailed user information for each transaction, as it is very likely to be
analysed in the future.
generate cards
Card sorting is a fundamental functionality of this software system, so this task is
necessarily classed as “MUST”.
edit cards with multi-source information
As with annotations for ontologies, domain experts are normally willing to write
down their thinking during the card sorting process. This mechanism enables users
to relate their thoughts to cards easily.
sort cards by user
This is another basic requirement of the project: users have to be able to sort cards
by themselves.
enable user defined cards
Besides the cards generated from the input file, users often want to create their own
cards.
save and track transactions freely
With traditional knowledge elicitation it is difficult to save the process and trace
back to previous activities freely, so this project aims to provide this function by
recording the transaction activities in an output file.
laddering by user
As one of the core tools in this plug-in, laddering by user must be implemented.
transaction and ladder manager
Users manage transactions and ladders by selecting them and performing the tasks
listed below: add a new transaction or ladder; edit an existing transaction or
ladder; delete a transaction or ladder.
SHOULD tasks:
put a straightforward GUI for arranging cards
This function makes sorting cards more efficient than a generic GUI would.
user defined laddering relationship
In addition to the pre-defined laddering relationships, users usually want to create
personalised ladder relationships.
formatted file output
Other programs can read the output of this plug-in easily if the output file is well
formatted. The details of the format are discussed in section 5.2.5.
user-friendly GUI
In terms of usability, people work efficiently and smoothly with a user-friendly
GUI.
plug-in of Protégé
If the project is built seamlessly as a Protégé plug-in, this is a great advantage for
its future development. Protégé is widely used, so this plug-in will have more test
cases to help with future extension.
MAY tasks:
get terms from multi-format documents
This function enables people to obtain input from different document formats, but
integrating the file format processors into this program costs much development
time.
speech/scanning input
This is an ideal, easy-to-use input method that is well suited to interviews; however,
speech or image processing would be a huge module and is not very practical for
an M.Sc. project. A light version might be developed, depending on the progress of
the project.
enable multi-tab allocation for large set of cards
When dealing with a large set of cards, users will find the graphical interface too
small to display all the cards. Multi-tab allocation for display is a helpful function,
although it increases the complexity of the implementation.
ontology-based automatic card sorting
From an existing knowledge base users may obtain sorting information and
grouping methods, which makes automatic card sorting possible. This could be
useful when the ontologies are well established; currently, however, few ontologies
meet this requirement.
CANNOT tasks:
automatically processing the terms
Deploying this component would ideally be quite efficient in daily work, but
because the focus of this thesis is not on Natural Language Processing (NLP)
techniques, we are not going to implement this function. Furthermore, current
NLP techniques do not meet the project requirements on performance and
accuracy; they are usually both CPU- and memory-intensive.
5.2.3 System Structure Design
The system structure diagram shows the basic structure of this software. The
typical paths through the system are listed below.
Start [Set of Terms] - card sorting and/or laddering - Output
Start - Term Extraction [Set of Terms] - card sorting and/or laddering - Output
Start - Term Extraction [Set of Terms] - card sorting and/or laddering - Relationship Building - Output
In the first path, the system starts with a set of terms available in a document of
the required format for eliciting knowledge, and users load the document directly
into the software. In the second path, users first extract terms manually with a few
mouse clicks and then perform card sorting and laddering. The third path
additionally enables users to set relationships among terms before obtaining the
output.
In Figure 5.1, the dashed lines split the figure into three parts: the starting
processes, the task-performing actions and the output session. These are
comparatively independent processes, managed by the transaction manager in the
broader context.
Figure 5.1: System Structure Diagram
Normally, the result of card sorting is the input to the laddering process, but a
ladder can also take its input directly from an existing file if necessary. Building
relationships is likewise not mandatory in the system.
5.2.4 Software Development Platform
Three choices are available for the development platform of this plug-in: C++,
Python and Java.
As a traditional object-oriented programming language, C++ has the advantages of
fast execution and ease of coding. Nevertheless, C++ programs often cannot be
compiled and executed across operating systems because of the limitations of
platform-specific library interfaces.
Python is known for its high productivity and is becoming more and more popular
in the knowledge engineering domain (Python Patterns - Implementing Graphs,
2003). It would be a good choice if the Protégé system were developed in Python.
However, Python-based software may cause compatibility problems when working
as a plug-in of a Java system, so it is not wise to develop a Protégé plug-in in
Python.
To stay consistent with the parent system, the knowledge elicitation plug-in for
Protégé is to be developed on the Java platform, for the sake of uniformity and
compatibility.
5.2.5 Output Format
The output format is one of the most important design issues because a primary
consideration of this plug-in is extensibility, which calls for unified input and
output. This software system may accept many kinds of input, so the output format
should be discussed here. The proposed output file formats are:
ASCII/Pure Text
HTML
Developer-defined format with specific syntax
RDF/XML
The decision-making criteria are:
Feasibility - Programs can parse it efficiently.
Portability - It should be easy to transfer over the Internet.
Acceptability - It should be widely accepted on different machines.
ASCII/pure text is the most common way to store information; the generated files
are small and easy to transfer. However, pure text files may have different default
formats when they are processed on different operating systems.
HTML is a well-defined, syntax-based mark-up language and is easy to parse. It is
the most widely used document format today, so there is no compatibility problem.
It might seem a good idea to develop a specific format for the output of this
project, so that the problems of pure text and HTML are eliminated. Nevertheless,
a self-defined format would also cause portability problems, which breaks the rule
of unification.
The Resource Description Framework (RDF) is based on URI and XML
technologies and aims to describe resources on the World Wide Web. It is a W3C
standard, so it is widely accepted. Because the original purpose of RDF is resource
description, semantically annotated information can be described well in it. Being
based on XML, RDF is machine readable, and RDF processing is well supported by
third-party programming language APIs.
As matters stand, the most sensible choice is to use the RDF format for information
storage in this software, on the grounds of feasibility, portability and acceptability.
Another possible output route is to use existing Protégé components, such as the
Protégé OWL Plug-in, to transfer the output directly to the ontology tree for future
development.
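To make the RDF/XML choice concrete, the following minimal Java sketch serialises a card-to-group assignment as RDF/XML using only string building. It does not reproduce the plug-in's real output schema: the namespace http://example.org/ke#, the property ke:inGroup and the class RdfOutputSketch are assumptions chosen for illustration, and a real implementation would more likely rely on a third-party RDF API.

```java
import java.util.Map;

// Hypothetical sketch: the ke namespace and the inGroup property are assumptions.
class RdfOutputSketch {
    static final String RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#";
    static final String KE_NS = "http://example.org/ke#"; // placeholder namespace

    /** Escapes the characters that are not allowed in an XML attribute value. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;")
                .replace(">", "&gt;").replace("\"", "&quot;");
    }

    /** Builds a URI for a term, replacing whitespace with underscores. */
    static String uri(String term) {
        return KE_NS + escape(term.trim().replaceAll("\\s+", "_"));
    }

    /** Writes one rdf:Description per card, pointing at the group it belongs to. */
    static String toRdfXml(Map<String, String> cardToGroup) {
        StringBuilder sb = new StringBuilder();
        sb.append("<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n");
        sb.append("<rdf:RDF xmlns:rdf=\"").append(RDF_NS).append("\"\n");
        sb.append("         xmlns:ke=\"").append(KE_NS).append("\">\n");
        for (Map.Entry<String, String> e : cardToGroup.entrySet()) {
            sb.append("  <rdf:Description rdf:about=\"").append(uri(e.getKey())).append("\">\n");
            sb.append("    <ke:inGroup rdf:resource=\"").append(uri(e.getValue())).append("\"/>\n");
            sb.append("  </rdf:Description>\n");
        }
        sb.append("</rdf:RDF>\n");
        return sb.toString();
    }
}
```

Each card becomes a resource and its pile becomes the object of a single property, so the same file can later be read by any RDF-aware tool.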
5.3 Plug-in for Protégé
This section introduces the design requirements and the procedure for integrating
a piece of independent Java software into an existing software system, Protégé. Let
us first look at the common technical requirements for a Protégé plug-in.
5.3.1 Common Technical Requirements
Because Java is memory intensive, the hardware requirement for a desktop
computer used to develop a Protégé plug-in is at least 512 megabytes of memory;
with 256 megabytes it runs extremely slowly. As for software requirements, any
major operating system can be used as the platform provided that the Java Virtual
Machine (JVM) runs on it; in addition, a Protégé system must be properly installed.
At the program level, the software to be plugged into the Protégé system has to be
able to run independently.
5.3.2 Integration Method
For testing purposes, it is not practical to create a JAR archive of the Java classes
and put it in the plug-in folder of the Protégé installation path after every change.
In this project, the developer instead creates a directory under the Protégé plug-in
path and sets the Java compiler to put the classes directly into it. A Java manifest
file must then be edited to declare that the classes run as a Protégé tab widget,
together with the name of the class that is executed via the “main” method.
At the programming level, the plug-in must first import the Protégé library and
override the “Initialize” method in the main class, in which the plug-in software is
initialised and executed. Finally, a call to the Protégé system must be included in
the “main” method. We can then run the program and see that Protégé has the
plug-in integrated. Figure 5.2 shows example code for the “main” method.
Figure 5.2: Sample Code of Using Protégé API
5.4 Implementation
The details of the implementation of the knowledge elicitation plug-in are given in
this section, which covers the implementation concepts, the design of the runtime
classes and their corresponding UML diagrams.
5.4.1 Implement Concepts
In this knowledge elicitation tool, laddering and card sorting are two different
sub-tools, placed in two separate tabs.
For the laddering tool:
Users are able to select the relationships of ladders, or define their own
relationships and questions.
a) Use existing concepts/classes in the existing ladders to build more ladders
with different relationships.
b) Define relationships to elicit explanations.
c) Define new relationships to elicit goals and values, e.g. hasGoal.
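The laddering concepts above can be sketched as a small data structure in which concepts are linked by named relationships and users may register their own. This is a hedged illustration, not the JLaddering class itself: LadderSketch, Rung, the pre-defined "isA" relationship and the method names are all assumptions; only hasGoal comes from the text.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only: names are assumptions, not the plug-in's JLaddering API.
class LadderSketch {
    /** One rung of the ladder: a concept linked to another by a named relationship. */
    static final class Rung {
        final String from;
        final String relationship;
        final String to;

        Rung(String from, String relationship, String to) {
            this.from = from;
            this.relationship = relationship;
            this.to = to;
        }
    }

    private final Set<String> relationships = new LinkedHashSet<>();
    private final List<Rung> rungs = new ArrayList<>();

    LadderSketch() {
        relationships.add("isA"); // assumed pre-defined relationship
    }

    /** Registers a user-defined relationship such as "hasGoal". */
    void defineRelationship(String name) {
        relationships.add(name);
    }

    /** Links two concepts; only known relationships are accepted. */
    void link(String from, String relationship, String to) {
        if (!relationships.contains(relationship)) {
            throw new IllegalArgumentException("Unknown relationship: " + relationship);
        }
        rungs.add(new Rung(from, relationship, to));
    }

    /** Returns the concepts reached from the given concept via the relationship. */
    List<String> targetsOf(String from, String relationship) {
        List<String> targets = new ArrayList<>();
        for (Rung rung : rungs) {
            if (rung.from.equals(from) && rung.relationship.equals(relationship)) {
                targets.add(rung.to);
            }
        }
        return targets;
    }
}
```

Rejecting unknown relationships forces the user to define a relationship before using it, which mirrors steps b) and c) above.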
For the card sorting tool:
a) The user defines the number of piles and gives their names and specifications.
b) Users should record their name on it.
c) The user saves the initialisation of the piles/groups.
d) Knowledge elicitation
There are two alternative means of eliciting terms and concepts from documents.
In the first, the user opens a document to elicit knowledge.
a) Select words, right-click to display the menu and put each word into a pile.
Make notes/annotations if necessary.
b) The selected words are displayed in different colours.
In the second, the user adds concepts/terms to the piles, making notes/annotations
if necessary.
c) Open the document; the defined terms are displayed in different colours to
indicate their groups.
d) Make notes/annotations if necessary.
Finally, save the knowledge elicitation process and output the knowledge
elicitation result as an RDF file.
Figure 5.3: A Diagram of Workflow Prototype
5.4.2 Software Java Classes
This project consists of seven major classes that are linked with each other. The KE
class is the main class, and the JTransaction class plays a major role in handling the
runtime operations and data. The UML diagrams and descriptions of the primary
classes are given below.
a) JTransaction
This class provides the overall transaction management. Transaction handling
makes it easy for users to trace back to a previous state simply by saving, loading
and switching between transactions. This transaction-based management
mechanism is an essential feature of the project and can be thought of as a
primitive ontology versioning system that enables users to manage their steps
while developing ontologies. It can also be extended to support more output
formats, for example OWL files, to meet future requirements.
Figure 5.4: Java Class JTransaction UML Diagram
b) KE
The main knowledge elicitation class has the most complex structure and functions.
It initialises the components of the interface and sets action listeners for each
activity, which trigger further operations.
Figure 5.5: Java Class KE UML Diagram
c) JCardSorting
This class defines a card instance to be shown in the card sorting panel, together
with its related operations.
Figure 5.6: Java Class JCardSorting UML Diagram
d) JDocElicitation
The program uses this class to create a frame that enables users to elicit terms
directly from texts.
Figure 5.7: Java Class JDocElicitation UML Diagram
e) JLaddering
The laddering procedure class is initialised and shown in the laddering tool panel,
including the basic properties of the ladder.
Figure 5.8: Java Class JLaddering UML Diagram
5.5 Conclusion
In this chapter, the knowledge elicitation plug-in for Protégé has been designed
and implemented in Java. The system structure and prototype diagrams illustrate
the basic functionalities of the plug-in from a black-box perspective, while the UML
diagrams use a white-box perspective to show the inner structure of the program
(Pressman, 1997).
The plug-in now runs well with the Protégé system. The next phase is user testing
and evaluation of the software, through which we can assess the quality of the work
in this chapter.
6. Software Testing and User Evaluation
After implementing the software system, an essential stage of this project is to
test and evaluate the plug-in using usability engineering techniques, so that the
developer can refine its interface and functionalities. Another purpose of user
evaluation is to collect comments on the plug-in by rating the quality of the
project, which can partially validate the overall development procedure and
methodology.
The testing and evaluation phases are discussed in two separate sections, followed
by the analysis and conclusion. Testing and evaluation are carried out by both
experts and non-experts in order to gather feedback from the different groups of
users.
6.1 User Evaluation Methodology
The testing process includes a brief guide to plug-in installation and usage. In
the final version of this plug-in, most reported bugs have been fixed, and the
suggestions for possible improvements have been carefully considered and
implemented where applicable.
6.1.1 Software Setup
The initial setup is not straightforward because the system was developed and
tested under a specific Java development environment, in which developers can
design the user interface simply by generating GUI forms. Furthermore, many
third-party Application Program Interfaces (APIs) are used to minimise the
programming task, so those packages also need to be installed in local paths
before compilation.
The detailed software setup procedure is listed in the appendix.
6.1.2 Elicit Knowledge using card sorting
All tasks start with creating a new transaction, which is discussed in detail later.
For the sake of flexibility, the plug-in provides three options for creating cards:
a) Document elicitation
By clicking the “Document Elicitation” button, users can load documents from
pure text files in a format the system recognises. Users can select a term either by
a normal selection action or by double-clicking the term, and can then either turn
it into a class that appears in the card sorting panel or place it in a group of the
tree at the bottom left of the main frame window.
Figure 6.1: The Document Elicitation Frame
b) Loading formatted pure text
The second option is to load a well-formatted pure text file consisting of a set of
terms separated by spaces or line breaks. This format restriction is predefined by
the system and can be modified to meet other requirements.
c) Creating a new card directly
This option is invoked by clicking the “New” button in the card sorting
tab-widget. A new card with a system-generated name is created for later
editing. This is convenient when users find it necessary to add new cards to the
transaction.
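The term format described in option b) can be illustrated with a minimal Java sketch. It is not the plug-in's actual loader: the class name TermFileSketch and the method splitTerms are hypothetical, and the only rule assumed is the one stated above, namely that terms are separated by spaces or line breaks.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of the plug-in: splits plain text into card terms.
class TermFileSketch {

    // Terms are assumed to be separated by any run of spaces or line breaks.
    static List<String> splitTerms(String text) {
        List<String> terms = new ArrayList<>();
        for (String token : text.trim().split("\\s+")) {
            if (!token.isEmpty()) {   // an empty input yields a single empty token
                terms.add(token);
            }
        }
        return terms;
    }

    public static void main(String[] args) {
        // Each resulting term would become one card in the card sorting panel.
        System.out.println(splitTerms("heart lung\nkidney  liver"));
    }
}
```

Changing the regular expression passed to split is one way such a format restriction could be adapted to other separators.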
Accordingly, there are also several options for grouping cards into
piles. Users can select one or more cards and click the “Add to Group”
button; the selected cards are then put into the currently selected group in the tree
at the bottom left. We can also use “Make Group” instead of “Add to Group”, in
which case a new group with a system-defined name is created under the currently
selected group, containing the selected cards. The “Add to Group” option is also
provided in the right-click menu. Furthermore, if users want to see the contents of
a group, they can click the “Show” button at the top of the group tree, and a new
tab-widget containing the cards in the group will appear to the right of the
card sorting and laddering tabs. Users can also perform actions on the cards in
the newly created tab.
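The difference between “Add to Group” and “Make Group” can be sketched in Java as follows. This is only an illustration under assumptions: it models the group tree with the standard Swing class DefaultMutableTreeNode, and the class GroupTreeSketch, its method names and the generated name pattern “Group1”, “Group2” are hypothetical rather than taken from the plug-in.

```java
import java.util.List;
import javax.swing.tree.DefaultMutableTreeNode;

// Hypothetical model of the two grouping actions on the group tree.
class GroupTreeSketch {
    private static int counter = 0;   // used for system-defined group names

    // "Add to Group": put the selected cards into the currently selected group.
    static void addToGroup(DefaultMutableTreeNode group, List<String> cards) {
        for (String card : cards) {
            group.add(new DefaultMutableTreeNode(card));
        }
    }

    // "Make Group": create a new, automatically named group under the selected
    // group and put the selected cards into it.
    static DefaultMutableTreeNode makeGroup(DefaultMutableTreeNode selected,
                                            List<String> cards) {
        DefaultMutableTreeNode group = new DefaultMutableTreeNode("Group" + (++counter));
        selected.add(group);
        addToGroup(group, cards);
        return group;
    }
}
```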
Ideally, users would like to use drag-and-drop actions to move the cards
around, but because of the limited programming time available in
this M.Sc. project, this functionality is not yet available. It is
discussed in Chapter 7 as part of future work.
Figure 6.2: Card Sorting Tool
6.1.3 Laddering the Result
As mentioned in Chapter 2, laddering and card sorting are usually used
together. In the CO-ODE project in the Medical Informatics Group, we are
more concerned with the management of medical and biological terminology.
Users therefore tend to use the laddering tool to define relationships and find
common concepts based on the results of card sorting.
Once we have a set of cards grouped, we may want to build a conceptual ladder. We
can do so by clicking the “Add to Ladder” button in the card sorting tab or by
choosing it from the right-click popup menu. Alternatively, we can add concepts
from the grouping tree to the ladder. Whatever the concept is and wherever it comes
from, the first concept added to the laddering tab becomes the root of a ladder, and
if the user does not select this ladder node, all subsequently added concepts are
treated as new ladders. This series of actions is similar to the operations on the
grouping tree.
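The rule for adding concepts to the laddering tab can be sketched in Java as below. The class LadderSketch and its methods are hypothetical and stand in for the plug-in's real laddering classes, which are not shown in this chapter; the sketch only encodes the rule stated above, where a concept added with no ladder node selected starts a new ladder and a concept added with a node selected becomes a child of that node.

```java
import java.util.ArrayList;
import java.util.List;
import javax.swing.tree.DefaultMutableTreeNode;

// Hypothetical model of the laddering tab: a list of ladder roots.
class LadderSketch {
    private final List<DefaultMutableTreeNode> ladders = new ArrayList<>();

    // 'selected' is the currently selected ladder node, or null if none is selected.
    DefaultMutableTreeNode addConcept(DefaultMutableTreeNode selected, String concept) {
        DefaultMutableTreeNode node = new DefaultMutableTreeNode(concept);
        if (selected == null) {
            ladders.add(node);    // no selection: the concept is the root of a new ladder
        } else {
            selected.add(node);   // selection: the concept is placed under that node
        }
        return node;
    }

    int ladderCount() {
        return ladders.size();
    }
}
```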
Following comments from testers, the “Show Card” button was added to
help users see the card sorting and laddering tabs simultaneously. The
user testing and evaluation results are given in Appendix B.
Figure 6.3: Card Sorting and Laddering Tool Appears Simultaneously
In most cases, users want to move a whole group, or a node and all its sub-nodes,
to the ladder. The “Set Parent” function, located at the top of the grouping tree and
in the right-click popup menu, is designed for this task. A message box giving
hints to the user is also displayed when the “Set Parent” button is clicked.
The properties of a ladder can be edited by clicking the “Ladder Settings”
button, where users can add and edit the relationships of the ladder, which
describe the relationship between the parent and child nodes of the laddering
tree. The descriptions attached to the ladder relationships are shown by
clicking the “Ladder Hints” button, which gives information to
help the user build the ladder and could be extended to other
computer-aided ladder design functions, for instance an automatic questioning
system.
Figure 6.4: Laddering Tool
6.1.4 Using the transaction manager
The transaction manager is a significant component in this project; the
transaction concept comes from informal discussion within the research group
(Wang, Rector, Stevens, 2005). This function is implemented by saving the runtime
status of the software (the laddering and grouping trees as well as the contents of
the card sorting tab) to main memory via a JTransaction class instance. The
structure of the trees is encoded and decoded with a recent Java API,
XMLEncoder/XMLDecoder, which is quite flexible in processing the
structure of Java objects.
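The encoding step can be illustrated with a small Java sketch. It does not reproduce the JTransaction class, whose internals are not shown here; the class TransactionXmlSketch and its methods are hypothetical. It assumes the trees are held as Swing DefaultMutableTreeNode objects with String labels and uses the standard java.beans.XMLEncoder and java.beans.XMLDecoder classes to save a tree to an in-memory byte array and restore it.

```java
import java.beans.XMLDecoder;
import java.beans.XMLEncoder;
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import javax.swing.tree.DefaultMutableTreeNode;

// Hypothetical sketch: save a tree to memory as XML and load it back.
class TransactionXmlSketch {

    // "Save Status": encode the tree; the XML is complete once the encoder is closed.
    static byte[] encode(DefaultMutableTreeNode root) {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (XMLEncoder encoder = new XMLEncoder(out)) {
            encoder.writeObject(root);
        }
        return out.toByteArray();
    }

    // "Load Status": rebuild an equivalent tree from the saved XML.
    static DefaultMutableTreeNode decode(byte[] xml) {
        try (XMLDecoder decoder = new XMLDecoder(new ByteArrayInputStream(xml))) {
            return (DefaultMutableTreeNode) decoder.readObject();
        }
    }

    public static void main(String[] args) {
        DefaultMutableTreeNode root = new DefaultMutableTreeNode("Disease");
        root.add(new DefaultMutableTreeNode("Infection"));
        root.add(new DefaultMutableTreeNode("Tumour"));
        DefaultMutableTreeNode restored = decode(encode(root));
        System.out.println(restored.getUserObject() + " has "
                + restored.getChildCount() + " children");
    }
}
```

Keeping the snapshot as a byte array mirrors the idea of holding saved status in main memory; writing the same bytes to a file would be a straightforward variation.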
We use this manager to organise the global actions performed by users,
so that they can go back to a previous runtime state by
simply choosing and loading a transaction they created and saved earlier.
Users just need to choose a target transaction and press the “Save Status” button,
and the system displays a message telling them whether the transaction was
saved successfully. When users come back from other transactions and
select this transaction again, they can load it into the working
tabs and trees immediately by pressing the “Load Status” button. This component
is located at the upper right of the interface.
It is worth mentioning that the user testing and evaluation results show that most
people strongly approve of this component.
6.2 User Evaluation
User evaluation shows the users’ attitudes towards the quality of this software.
Following the requirements of user-centered design, the feedback from user
evaluation is treated as an essential guideline for the software testing and
debugging procedure.
In this project, the user evaluation has two parts. One is the interface
evaluation, which concerns the plug-in’s GUI, including ease of use, look and feel,
and so on. The other is the functional evaluation, whose emphasis is on the
background functionality. This evaluation methodology aims to gather users’
comments on three basic qualities in user-centered design: the software
should be powerful, flexible and robust.
Eleven people were involved in the user evaluation activities, and they are
diverse in academic and cultural backgrounds. To quantify the results, a
grading system similar to that of university examinations is borrowed: 5 is a
pass, 6 is a good pass, and 7 is a distinction. In the arrays of scores presented
below, the first five scores in each array come from experts or frequent users of
knowledge systems.
6.2.1 Interface Evaluation
For the user interface design, grades are given for four different aspects.
Users were asked to grade each of the four aspects, and their
grades are listed below. To reduce the effect of extreme values, the average score is
calculated after eliminating the highest and the lowest score in each array.
                   Domain Experts (n = 5)   Non-experts (n = 6)   Avg.
Look and feel      9  7  7  8  9            7  7  6  7  8  6      7.3
Interface layout   7  7  9  6  8            6  9  5  6  5  6      6.7
Ease of use        7  7  6  7  8            7  6  6  6  6  5      6.4
Flexibility        7  8  8  6  5            6  6  6  9  6  8      6.8
The overall score is calculated with a formula based on the standard deviation
(Kenney & Keeping, 1962), which gives 6.7 here.
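The per-aspect average described above (drop one highest and one lowest score, then take the mean of the rest) can be reproduced with a short Java sketch. The class and method names are hypothetical; the data are the eleven “Look and feel” grades from the table.

```java
import java.util.Arrays;
import java.util.Locale;

// Hypothetical helper: mean of the scores after removing one highest and one lowest.
class TrimmedMeanSketch {

    static double trimmedMean(int[] scores) {
        if (scores.length < 3) {
            throw new IllegalArgumentException("need at least three scores");
        }
        int[] sorted = scores.clone();
        Arrays.sort(sorted);
        int sum = 0;
        // Skip index 0 (lowest) and the last index (highest).
        for (int i = 1; i < sorted.length - 1; i++) {
            sum += sorted[i];
        }
        return (double) sum / (sorted.length - 2);
    }

    public static void main(String[] args) {
        int[] lookAndFeel = {9, 7, 7, 8, 9, 7, 7, 6, 7, 8, 6};
        // The nine remaining grades sum to 66, so the average is 66 / 9.
        System.out.println(String.format(Locale.ROOT, "%.1f", trimmedMean(lookAndFeel))); // prints 7.3
    }
}
```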
6.2.2 Functional Evaluation
The functional evaluation involves grading each basic component: card sorting,
laddering, relationship setting and the transaction manager. These are the four
major components provided by this plug-in, and users easily become familiar
with them, so grading these components is straightforward.
                       Domain Experts   Non-experts         Avg.
Card sorting           9  8  7  9  9    7  8  8  7  7  8    7.9
Laddering              7  6  7  7  7    7  6  7  7  9  8    7.0
Relationship setting   6  7  6  6  6    8  7  9  7  8  8    7.0
Transaction manager    9  9  8  8  9    8  7  8  9  8  9    8.4
The overall score is 7.6. Having collected the scores from users, we can
analyse the results and draw conclusions from this user evaluation procedure.
6.3 Evaluation Result Analysis and Conclusion
From the scores, we can readily see that users are mostly satisfied with the
functionality of the plug-in, which indicates that the primary user-centered design
procedure is well established. As for the interface of the plug-in,
although the score is less impressive, users have still given generously positive
comments.
Looking further into the evaluation results, we find that the interface look and
feel, card sorting and transaction management components have the highest ratings
and are evidently the best implemented. Meanwhile, the elements related to
ease of use and interface layout require considerable future
improvement.
Going further, we may find that the plug-in’s interface is appreciated more by
the experts than by the non-experts, because the knowledge engineering experts are
more familiar with the existing Protégé system and with the card sorting and
laddering approaches. They find that this software has a style unified with the
Protégé system, which means little to the non-experts. Conversely,
the experts are not fully satisfied with the laddering tool and the relationship
setting component. Their feedback indicates that their way of working is somewhat
different from the way this plug-in works. This is because, as mentioned in Chapter 2,
people from different disciplines are likely to use laddering tools in many different
ways for different purposes, and the plug-in was developed according to the design
principles of the CO-ODE project, with a strong emphasis on the medical and
biological domain.
It is worth mentioning that staff from the wholly independent UK Freshwater
Life Biological Association evaluated the plug-in, and their comment on this
software is:
“It was good to see what he has been doing and looks like a potentially very useful
tool. We’d really like to get our hands on a copy to play around with. Even in its
current state it could save us considerable time.” (McNicol, 2005)
In a nutshell, this plug-in is generally considered to be a well-implemented and
powerful tool in real use, although the interface is probably fully appreciated only
by the knowledge system experts. All the evidence from the user evaluation shows
that people are keen to see the future development of this plug-in.
7. Conclusion and Future Works
7.1 General Conclusion
The explosion of information and data on today’s World Wide Web has left people
with the complicated task of organising and managing web resources. In particular,
people find it difficult to locate interdisciplinary information while web
search engines are unable to identify the properties of the information intelligently.
Therefore, with the advent of Semantic Web technology, ontology, a
term borrowed from philosophy, is now widely applied as the knowledge base of
this technological frontier. Consequently, the study of ontology engineering, which
supports building and managing ontologies, is undeniably important when
developing Semantic Web applications. The first step in building ontologies is
usually to elicit knowledge from various sources of information, including plain text,
documents, voice and video. Using traditional knowledge elicitation techniques to
build the basic concept structures is usually time-consuming and labour-intensive,
so the demand for a graphical tool, in the form of a Protégé plug-in, to help in this
process is pressing.
In this project, a knowledge elicitation plug-in with card sorting and laddering tools
was built using user-centered design techniques to help users elicit knowledge
and output it to a document. It shows that knowledge elicitation techniques
can be integrated into Protégé. Potentially, this project can be linked to the
Protégé OWL Plug-in by providing the initial ontology structure tree. The project in
this thesis is part of the ongoing CO-ODE project of the Medical Informatics
Group at the University of Manchester. The development of this project looks
plausible and has been tested through the user evaluation process.
As another major aspect of this thesis, usability engineering methods have been
widely employed throughout the entire project, including the design process and the
testing and user evaluation phases. Users have actively participated in designing,
testing and evaluating this plug-in, and have given valuable feedback on it.
Owing to the limited size of a Master’s project, the number of users who participated
in this project is comparatively small, but eleven people from
various disciplines still commented on each stage of the project. The project has
collected ideas not only from experts in the knowledge management domain; many
other people who are not IT specialists have also contributed a great deal. I would
therefore expect the people involved in the user-centered design sessions to be
representative enough to complete a basic design cycle.
On the one hand, the empirical evaluation by the knowledge management experts
has shown that the software has made considerable progress in helping them
elicit, build and structure knowledge. On the other hand, the opinions of
non-IT people have revealed that this plug-in has a user-friendly look and feel and
is particularly learnable. While they were testing and evaluating this tool, many bugs
in the software were reported and fixed; these are also recorded in the appendix.
In all, in terms of the requirements of this Master’s project and the results of the
user evaluation, this knowledge elicitation plug-in for Protégé has been successfully
completed and has mostly met the proposed software specification. In addition, it
has already attracted the attention of real users, who are eager to see a publicly
released version of the plug-in to facilitate their daily work.
This project also provides an example that might be developed into a
framework giving a routine for building knowledge elicitation tools within an
existing pluggable software system. Such a framework would enable people to develop
similar knowledge elicitation tools, such as a repertory grid tool, a diagram tool, a
matrix tool and so on, within the Protégé environment or other systems. It is also
possible to borrow the user-centered design principles learned from actual users in
this project to help develop tools within the framework.
7.2 Future Work
There are four major aspects of the future development of the plug-in:
the input of terms, the output of the knowledge base, the interface and the
functionality.
First of all, there are many other means of term input, such as automatic
extraction of terms from web documents based on data mining and natural
language processing techniques, which would make the elicitation process much more
effective than the current manual elicitation from plain text. This add-on requires
in-depth knowledge of the research topics listed above, but fortunately a number of
existing tools have been developed (e.g. in the AKT project), so there is an
opportunity to transplant those tools into this plug-in.
Secondly, by cooperating with the Protégé OWL Plug-in team, this plug-in could
directly output the tree-like structure to the ontology structure tree in the Protégé
OWL Plug-in, so that users can edit their card sorting and laddering results in the
Protégé OWL Plug-in and output them smoothly to a document in RDF or OWL syntax.
The third potential development is to enhance the user interface of this plug-in. The
user evaluation results indicate that users are not fully satisfied with the
interface, so future improvements are required. Actions such as drag-and-drop and
direct grouping on the card sorting tab-widget are strongly recommended.
Finally, as mentioned in Chapter 2, knowledge elicitation techniques are not
limited to card sorting and laddering; many other options, such as a
diagram tool, a repertory grid tool and a matrix tool, could also be deployed as
further tab-widgets. Doing so would considerably enhance the functionality of this
plug-in.
For future work, close collaboration with the CO-ODE and Protégé communities is
essential; alternatively, it is possible to adopt a public software licence and set up
an open source project so that people can join in the development. At present, at
least four people in the knowledge engineering research domain are interested in
developing this plug-in and willing to join the team.
References
Bechhofer, S., Goble, C., Carr, L., Kampa, S. (2002) COHSE: Semantic Web gives
a Better Deal for the Whole Web? Poster presentation at ISWC International
Semantic Web Conference, Sardinia.
Berners-Lee, T. (1996). The World Wide Web: Past, Present and Future. In IEEE
Computer special issue, v.29 n.10, pp.69-77.
Berners-Lee, T., Fielding, R., Irvine, U.C., Masinter, L. (1998). Uniform Resource
Identifiers (URI): Generic Syntax. IETF Request for Comments: 2396. [Online:
http://www.ietf.org/rfc/rfc2396.txt]. Date accessed: 15th July 2005.
Berners-Lee, T., Hendler, J., Lassila, O. (2001). The Semantic Web. In Scientific
American, 284(5), pp. 34-43.
Benysh, D. V., Koubek, R. J., & Calvez, V. (1993). A comparative review of
knowledge structure measurement techniques for interface design. International
Journal of Human-Computer Interaction, 5, pp. 211-237.
Boose, J. H., & Gaines, B. R., (Eds.) (1990). The Foundations of knowledge
acquisition, Knowledge Based Systems, Vol. 4 San Diego, CA: Academic Press.
Boose, J. H., & Gaines, B. R., (Eds.) (1988). Knowledge acquisition Tools for
Expert Systems, Knowledge Based Systems Vol. 2. San Diego, CA: Academic
Press.
Bray, T., Paoli, J., Sperberg-McQueen, C.M., Maler, E. (2000). Extensible Markup
Language (XML) 1.0 (Second Edition). W3C Recommendation. [Online:
http://www.w3.org/TR/REC-xml]. Date accessed: 15th July 2005.
Brewster, C., Ciravegna, F. & Wilks, Y. (2002). User-Centred Ontology Learning
for Knowledge Management. In Proceedings 7th International Workshop on
Applications of Natural Language to Information Systems, Stockholm.
Brewster, C., Ciravegna, F. & Wilks, Y. (2003). Background and Foreground
Knowledge in Dynamic Ontology Construction: Viewing Text as Knowledge
Maintenance. In Proceedings of the Semantic Web Workshop, SIGIR,
Toronto, Canada.
Brickley, D., Guha, R.V. (2002) RDF Vocabulary Description Language 1.0: RDF
Schema. W3C Working Draft. [Online: http://www.w3.org/TR/rdf-schema/]. Date
accessed: 15th July 2005.
Carr, L., Bechhofer, S., Goble, C., Hall, W. (2001). Conceptual Linking:
Ontology-based Open Hypermedia. WWW10, Tenth World Wide Web Conference,
Hong Kong.
Ciravegna, F. (2001). Adaptive Information Extraction from Text by Rule Induction
and Generalisation. In Proceedings 17th International Joint Conference on Artificial
Intelligence (IJCAI 2001), Seattle.
Cooke, N. J. (1994). Varieties of knowledge elicitation techniques. International
Journal of Human-Computer Studies, 41, pp. 801-849.
Connolly, D., Harmelen, F. van, Horrocks, I., McGuinness, D., Patel-Schneider, P.F.,
Stein, L.A. (2001) Annotated DAML+OIL Ontology Markup. W3C Note. [Online:
http://www.w3.org/TR/daml+oil-walkthru/]. Date accessed: 15th July 2005.
Corbridge, C., Rugg, G., Major, N.P., Shadbolt, N.R. & Burton, A.M. (1994)
Laddering: Technique and Tool Use in knowledge acquisition. Knowledge
acquisition, 6, pp. 315-341.
Dix, A., Finlay, J., Abowd G., Beale, R. (2004). Human-Computer Interaction,
Third Edition. Prentice Hall. Upper Saddle River, NJ, USA. ISBN: 0-13-437211-5.
Gamma, E., Helm, R., Johnson, R. & Vlissides, J. (1995). Design Patterns:
Elements of Reusable Object-Oriented Software. Addison-Wesley. Boston, MA,
USA. ISBN: 0-201-63361-2.
Fernandez, M., Gomez-Perez, A., Pazos Sierra, A., Pazos Sierra, J. (1999). Building
a Chemical Ontology Using METHONTOLOGY and the Ontology Design
Environment. In IEEE Expert (Intelligent Systems and Their Applications), 14(1):
pp. 37-46.
Glaser, R., & Chi, M. T. H. (1988). Overview. In M. T. H. Chi, R. Glaser, and M. J.
Farr (Eds.), The nature of expertise (pp. xv-xxviii). Hillsdale, NJ: Erlbaum.
Gruber, T.R. (1993). Toward Principles for the Design of Ontologies Used for
Knowledge Sharing. Formal Ontology in Conceptual Analysis and Knowledge
Representation. Kluwer Academic Publishers.
Haarslev, V. & Möller R. (2001). RACER System Description. In R. Goré, A.
Leitsch, and T. Nipkow, editors, International Joint Conference on Automated
Reasoning, IJCAR’2001, June 18-23, Siena, Italy, pages 701–705. Springer-Verlag.
Hinkle, D. (1980). The change of personal constructs from the viewpoint of a
theory of construct implications. Unpublished Ph.D Thesis, Ohio State University,
1965 Cited in: Bannister, D. & Fransella, F. Inquiring Man. Penguin,
Harmondsworth.
Hoffman, R. R. (1987). The problem of extracting the knowledge of experts from
the perspective of experimental psychology. AI Magazine, 8, pp. 53-67.
Horrocks, I. (1998). The FaCT system. Proc. Automated Reasoning with Analytic
Tableaux and Related Methods: Int’l Conf. Tableaux 98, Lecture Notes in Artificial
Intelligence, no. 1397, Springer-Verlag, Berlin, 1998, pp. 307–312.
Kenney, J. F. and Keeping, E. S. (1962). “The Standard Deviation” and
“Calculation of the Standard Deviation.” §6.5-6.6 in Mathematics of Statistics, Pt. 1,
3rd ed. Princeton, NJ: Van Nostrand, pp. 77-80.
Knublauch, H., Musen, M. A., Rector, A. (2004). Editing Description Logic
Ontologies with the Protégé OWL Plugin. International Workshop on Description
Logics - DL2004, Whistler, BC, Canada.
Lambrix, P., Habbouche, M. & Pérez, M. (2002) Evaluation of ontology
development tools for bioinformatics. Bioinformatics Vol. 19 no. 12 2003, pp.
1564–1571.
Lassila, O., Swick, R.R. (1999). Resource Description Framework (RDF) Model
and Syntax Specification. W3C Recommendation. [Online:
http://www.w3.org/TR/REC-rdf-syntax/]. Date accessed: 17th July 2005.
Milton, N. (2005) PCPACK. [Online:
http://www.epistemics.co.uk/Notes/55-0-0.htm]. Date accessed: 29th April 2005.
McBride, B. (2002). Jena: a semantic Web toolkit. Internet Computing, IEEE,
Volume: 6, Issue: 6, pp. 55 - 59. ISSN: 1089-7801.
McNicol, K. (2005). Comments on the knowledge elicitation Plug-in. Personal
communication. Windermere, Lancashire, UK.
Mozilla.org. (2005). Firefox Browser. [Online:
http://www.mozilla.org/products/firefox/]. Date accessed: 15th July 2005.
Musen, M. A., Fergerson, R. W., Grosso, W. E., Noy, N. F., Crubezy, M., & Gennari,
J. H. (2000) Component-Based Support for Building Knowledge-Acquisition
Systems. Conference on Intelligent Information Processing (IIP 2000) of the
International Federation for Information Processing World Computer Congress
(WCC 2000), Beijing.
Nielsen, J. (1995). Card sorting to Discover the Users' Model of the Information
Space. [Online: http://www.useit.com/papers/sun/cardsort.html]. Date accessed:
15th July 2005.
Nurmuliani, N., Zowghi, D., Williams, S. P. (2004). Using Card Sorting Technique
to Classify Requirements Change. Proceedings of the 12th IEEE International
Requirements Engineering Conference.
Norman, D.A. & Draper, S.W. (Eds.). (1986). User Centered System Design - New
Perspectives on Human-Computer Interaction. Lawrence Erlbaum Associates:
Hillsdale, NJ.
Noy, N. F., Sintek, M., Decker, S., Crubezy, M., Fergerson, R. W. & Musen M. A.
(2001). Creating Semantic Web Contents with Protege-2000. IEEE Intelligent
Systems 16(2): 60-71.
Pressman, R.S. (1997). Software Engineering: A Practitioner’s Approach, McGraw
Hill, New York.
Python Patterns - Implementing Graphs. (2003). Retrieved January 20, 2003.
[Online: www.python.org/docs/essays/graphs.html]. Date accessed: 21st July 2005.
Rauterberg, M. (2003). User Centered Design: What, Why, and When. tekom;
Jahrestagung 2003 (E. Graefe; ed.), Usability Forum, pp. 175-178.
Rector, A. (2002). CO-ODE: Collaborative Open Ontology Development
Environment. Proposal to JISC under the Semantic Web Initiative.
Reynolds, T. J. & Gutman, J. (1988). Laddering theory, method, analysis, and
interpretation. Journal of Advertising Research, February-March 1988, pp. 11-31.
Rugg, G., Corbridge, C., Major, N.P., Shadbolt, N.R. & Burton, A.M. (1992). A
comparison of Sorting Techniques in knowledge elicitation. Knowledge acquisition,
vol. 4, pp. 279-291.
Rugg, G., Eva, M., Mahmood, A., Rehman, N., Andrews S., & Davies, S. (2002).
Eliciting information about organizational culture via laddering. Info Systems J
(2002) 12, pp. 215–229.
Rugg, G., & McGeorge, P. (2002). Eliciting Hierarchical Knowledge Structures:
Laddering. Encyclopedia of Microcomputers, vol. 28, supplement 7, pp. 69-110
Marcel Dekker, Inc, New York.
Rugg, G., & McGeorge, P. (1995). Laddering. Expert Systems, 12, 339–346.
Rugg, G., & McGeorge, P. (1997). The Sorting Techniques: A Tutorial Paper on
Card Sorts, Picture Sorts and Item sorts. Expert Systems, vol. 14, pp. 80 - 93.
Shadbolt, N. (ed.) (2003b) Advanced Knowledge Technologies: Selected Papers,
ISBN 0854-327932.
Shadbolt, N.R. & Burton, M. (1995). Knowledge elicitation: a systematic approach,
in Evaluation of human work: A practical ergonomics methodology 2nd Edition J.
R. Wilson and E. N. Corlett Eds, Taylor and Francis, London, England, 1995.
pp.406-440. ISBN-07484-0084-2.
Schneider, P., Horrocks, I., Harmelen, F. (2002). OWL Web Ontology Language
1.0 Abstract Syntax. W3C Working Draft. [Online:
http://www.w3.org/TR/2002/WD-owl-absyn-20020729/]. Date accessed: 5th June
2005.
Schreiber, A.Th., Akkermans, H., Anjewierden, A., de Hoog, R., Shadbolt, N., van
de Velde, W., & Wielinga, B.J. (2000). Knowledge Engineering and Management:
CommonKADS Method, MIT Press, Cambridge.
Shneiderman B. (1998). Designing the User Interface 3rd Edition, Addison Wesley,
Reading, Massachusetts.
Sommerville I. & Sawyer, P. (1997) REQUIREMENTS ENGINEERING John
Wiley & Sons, Chichester/New York/Weinheim/Brisbane/Singapore/Toronto.
Sun Microsystems, Inc. (2005). Java Technology Overview. [Online:
http://java.sun.com/overview.html]. Date accessed: 10th May 2005.
Tedeschi, B (1999). Good Web Site Design Can Lead to Healthy Sales. New York
Times, August 30, 1999.
Wang, Y., Rector, A., Stevens, R. (2005) Discussion of knowledge elicitation
Plug-in Design. M.Sc. Project Meeting. Manchester, UK.
Wansink, B. (2003). Using laddering to understand and leverage a brand’s equity.
Qualitative Market Research: An International Journal Volume 6, Number 2, pp.
111-118.
yWorks. (2005). yDoc: A Javadoc UML Extension. [Online:
http://www.yworks.com/en/products_ydoc.htm]. Date accessed: 26th July 2005.
Appendix A. The Interviewees’ Profile
Eleven people were involved in the software testing and evaluation procedure.
To protect personal data, the profile is shown only in groups.
Fields of expertise
There are 5 people working in Knowledge Management related domains, 3
people who majored in Computer Science but focus on other fields, 2 people
working in Biological Science, and 1 person with a business studies
background.
Cultural Backgrounds
There were 7 Chinese people and 4 Western people who participated in the interviews.
Years of using computer
The 8 Computer Science people have used computers for over 7 years, and 2 of them
have more than 15 years of experience in using computers. The other 3 people
began to use computers when they were college students.
Operating system preference
5 people prefer to use a Unix-like system, and 3 of them are Western
people. 6 people choose Windows and 1 person prefers to use Mac OS.
Gender
There are 3 female and 8 male interviewees.
B. User Testing and Evaluation Results
Three typical testing and evaluation results are presented here.
B.1 Results from Knowledge Management Community
The results are taken from the 5 people from the Knowledge Management community
and contribute the first 5 columns of the grade arrays in Chapter 6.
Gender: Male
Major: B.Sc. Informatics; M.Sc. Bioinformatics; Ph.D. Computer Science
Ethnicity: Western, Chinese
Suggestions after testing:
Grouping cards on the panel
Make comments on logs
Stable output like OWL
Moving the cards here and there, visually, grouping just on the panel
Drag and drop
Relationships other than trees.
Interface evaluation:
Look and feel. Grades: 9 7 7 8 9
Interface layout. Grades: 7 7 9 6 8
Ease of use. Grades: 7 7 6 7 8
Flexibility. Grades: 7 8 8 6 5
Functional evaluation:
Card sorting. Grades: 9 8 7 9 9
Laddering. Grades: 7 6 7 7 7
Relationship setting. Grades: 6 7 6 6 6
Transaction manager. Grades: 9 9 8 8 9
Knowledge Management community people’s real work:
Begin with excel file
First thing is things
Make words fuller, complete words automatically
Using Microsoft Visio to lay cards
Show small piles in big piles
Moving the cards to new groups
Hierarchically show the cards
Different relationships, different colors
A task panel for working. Like visa, put things altogether.
B.2 Results from Computer Science Students
The results contribute the 6th column of the grade arrays in Chapter 6.
Gender: Male
Major: M.Sc. Advanced Computer Science
Ethnicity: Chinese
Suggestions on laddering and card sorting tool:
A message box should appear while add group tree concepts to ladder
Reset button reminder
Multiple selection to add to the group by ALT+click
Node selection mode, blank mode message warning
Give feedback about how to select words
Step track back
Bigger trees
Interface evaluation:
Look and feel. Grades: 7
Interface layout. Grades: 6
Ease of use. Grades: 7
Flexibility. Grades: 6
Functional evaluation:
Card sorting. Grades: 7
Laddering. Grades: 7
Relationship setting. Grades: 8
Transaction manager. Grades: 8
B.3 Results from Business Student
The results contribute the 7th column of the grade arrays in Chapter 6.
Gender: Female
Major: M.Sc. Finance
Ethnicity: Chinese
Suggestions on laddering and card sorting tool:
In the card sorting panel, right button menu to rename the card
Document Elicitation could automatically remove blank characters
After loading status, the color should change back to yellow
Save/Load status exceptions.
Direct grouping by single click selection while press ALT
Edit ladder from right click popup menu
Give feedback while click the ladder settings by mistake.
Interface evaluation:
Look and feel. Grades: 7
Interface layout. Grades: 9
Ease of use. Grades: 6
Flexibility. Grades: 6
Functional evaluation:
Card sorting. Grades: 8
Laddering. Grades: 6
Relationship setting. Grades: 7
Transaction manager. Grades: 7
C. Software Setup
The first task is to install a Java compiler. We use the Java 2 Platform Standard
Edition Development Kit 5.0 (JDK 5.0) (Sun Microsystems, 2005) as the
development toolkit. This toolkit can be downloaded from:
http://java.sun.com/j2se/1.5.0/download.jsp
After downloading and installing JDK 5.0, we can run Java programs on the Java
Virtual Machine (JVM) provided by this toolkit.
The second crucial piece of software required by this project is the Protégé system.
The current release is Protégé 2000 version 3.1, which can be obtained from:
http://protege.stanford.edu/download/download.html
The Protégé installation is straightforward: users download the binary
executable files. Another API commonly required by Semantic Web related
applications is Jena (McBride, 2002). The download page and guidelines can be
found at:
http://jena.sourceforge.net/
For this project, users can extract the required packages directly to the root path
so that the plug-in can be compiled and run easily. After all the packages are
installed and ready to be loaded, we can execute the Java code either in the IDE
or from the system console, defining the parameters
required by each package. A typical execution command, shown below, is long and
complicated, so it is best run from a script file. The command is wrapped here
for readability and should be entered on a single line.
java -Dprotege.dir=C:\Protege_3.1_beta
-classpath "C:\Protege_3.1_beta\plugins\uk.ac.man.cs.mig.coode.ke;
C:\Jena-2.2\lib\xercesImpl.jar;C:\Jena-2.2\lib\jena.jar;
C:\Jena-2.2\lib\junit.jar;C:\Jena-2.2\lib\jakarta-oro-2.0.5.jar;
C:\Jena-2.2\lib\commons-logging.jar;C:\Jena-2.2\lib\log4j-1.2.7.jar;
C:\Jena-2.2\lib\xml-apis.jar;C:\Jena-2.2\lib\antlr.jar;
C:\Jena-2.2\lib;C:\Jena-2.2\lib\concurrent.jar;
C:\Jena-2.2\lib\icu4j.jar;C:\Protege_3.1_beta\looks.jar;
C:\Protege_3.1_beta\protege.jar;C:\Protege_3.1_beta\unicode_panel.jar"
KE
If this knowledge elicitation plug-in is successfully integrated into the Protégé
system, the user will find an option to tick this plug-in in the Protégé plug-in list,
as the following screenshot shows. After ticking this plug-in, a tab-widget will
appear beside the existing widgets, and we can then perform tasks on it.
Figure A.1: Snapshot of Choosing KEToolTab