European Union Training Application (EU Tap)
The GCSE training program for geography students interested in the European Union
By Rosaleen Hegarty (ID: 99606887), Sonya Conlon (ID: 41615103), and Elijah Blyth (ID: 41754903)
Abstract: The field of Intelligent Multimedia integrates multimodal input to aid human-computer interaction and to address the correspondence problem. Educational systems are designed to help with learning, making the process more effective and enjoyable. Traditional computer-based approaches stemmed from "textbook-style learning", where information was presented on screen rather than in book format; later attempts added simple pictorial information and sound. This report outlines the design and implementation of a CSLU Toolkit-based European Union training and testing program, aimed at GCSE students, that takes as its inputs and outputs gestural (point and click), image-based, textual (through Natural Language Processing), and verbal modalities. The aim of the report is to design and implement a system that makes the learning process both interactive and fun.
Key words: Natural Language Processing, E-Learning, multimedia, Natural Language Generation, Fun
Table of Contents
1. INTRODUCTION
1.1 Introduction
1.2 Aims and Objectives
2 BACKGROUND
2.1 Introduction
2.2 What is Intelligent Multimedia?
2.3 Why Multimodal Systems?
2.4 The CSLU Toolkit
2.4.1 Tool Command Language (TCL)
2.4.2 Speech Recognition, Generation, and Facial Animation
2.4.3 Rapid Application Development Environment
2.5 E-Learning
2.5.1 Benefits of E-Learning
2.6 Related Papers
3. ANALYSIS
3.1 Introduction
3.2 Analysis of similar systems
3.3 System Requirements
3.3.1 Non-Functional
3.3.2 Functional
3.4 User Requirements
3.4.1 Non-Functional
3.4.2 Functional
3.5 Hardware Requirements
3.6 Software Requirements
4. DESIGN
4.1 Introduction
4.2 Application Architecture
4.3 Human Computer Interaction Guidelines
4.3.1 Consistency
4.3.2 Compatibility with User Expectations
4.3.3 Flexibility and Control
4.3.4 Error Prevention and Correction
4.3.5 Continuous and Informative Feedback
4.3.6 Visual Clarity
4.3.7 Relevance of Information
4.4 Unified Modelling Language
5. IMPLEMENTATION
5.1 Introduction
5.2 European Union EU Tap Demo
5.3 Implementation Stage
6. TESTING
6.1 Introduction
6.2 White, Black and Grey Box Testing
6.2.1 White Box Testing
6.2.2 Black Box Testing
6.2.3 Grey Box Testing
6.3 Format
6.4 Forms
6.5 Debugging
6.6 Conclusion
7. CONCLUSION
7.1 Introduction
7.2 Results
7.3 Critical Analysis
7.4 Future Developments
8. REFERENCES
9. APPENDICES
Transcripts of Test Runs of System
Bug Tracking Form for the EU Tap System
Program Scripts
Table of Figures
Figure 1: Computational Model for integrating linguistic and pictorial information
Figure 2: The CSLU Toolkit Overview
Figure 3: Example of the CSLU RAD interface
Figure 4: Bug Tracking System or BTS used in this project
Figure 5: Use Cases are used to develop the static and dynamic object models
Chapter 1
Introduction
“Until comparatively recently, Artificial Intelligence was seen as the concern of only the most advanced researchers in computer science. Now, largely due to the falling costs of computers and
silicon chips, the ‘Intelligence’ of information technology equipment is continually being increased in practical terms.”
(Aleksander, I. 1984)
1.1 Introduction
Many attempts have been made to create and develop electronic educational training
programs that are both natural and easy to use. In the past these attempts ranged
from simple textual systems presenting the information, with the lecturer then handing
out written or typed exams on the subject (this will be referred to as the traditional
approach). Further attempts incorporated point-and-click systems of information
retrieval and pictorial references, while still using text as the main information
display technique; Microsoft's Encarta Encyclopaedia, while not strictly an
educational tool, is one such example. There have been some
educational tools developed for computer systems that teach a language or range of
languages using audio output extensively; however, none to date has incorporated
language input. Intelligent Multimedia gives the programmers of these systems a new
approach to improving the usability, and possibly the effectiveness, of such systems,
through the incorporation of more than one or two modalities in these
“E-Learning Systems”. It is now understood that for learning to progress efficiently
in the classroom, students should be presented with not only textual information but
also pictures, sound, and even movement. For this reason this paper
shall attempt to show how these important modalities can be incorporated, more
easily than in a classroom, into the European Union E-Learning system outlined, and
thus improve not only the user experience but also the effectiveness of the system and
learning.
The type of deixis used in this paper is known as Demonstratio ad Oculos: the objects
on display are visually observable (i.e., they have already been introduced) and the
user and the system share a common visual field (i.e., the map of Europe). Complex
visual systems and overly long audio options can therefore be avoided, thanks to the
commonalities shared by the user and the system.
This report shall contain Background information regarding E-Learning Systems,
Intelligent Multimedia, and previously related papers on the subject. The Background
section shall also cover an introduction to the CSLU Toolkit, and its components.
The Analysis chapter shall address the user and system requirements, both functional
and non-functional, and also the hardware and software requirements that are
necessary for a user to ensure optimum efficiency and low error count within the
system. The design chapter shall introduce the Application Architecture and the
Human Computer Interaction guidelines that affect the design of the system, along
with any other factors taken into account during the design. The
Implementation and Testing chapters shall cover how the CSLU toolkit was utilised to
satisfy the aims and objectives of the system based on the design, and also how this
system was tested, debugged and what methodologies were used when testing to
ensure a satisfactory level of testing was carried out. These sections shall also include
a basic description of the different methodologies of testing that are available for use,
the format of the testing, and the forms used to track errors, bugs, and other issues
(the Bug Tracking System used). The final section will cover our conclusion and
critical analysis, and finally the future changes and additions to the project.
1.2 Aims and Objectives
“Multimodal speech and gesture interfaces are promising alternatives to desktop input devices and WIMP (windows, icons, menus, pointer) metaphors. They provide a wide spectrum of appropriate
interaction ranging from gesture-based direct manipulation to distant multimodal instruction or even discourse-based communication with artificial humans”
(Marc Erich Latoschik 2005)
The aim of this system is to include not only the traditional techniques of visual
outputs (Text and Pictorial references) but also include the audio outputs used in the
language training tools. This gives the student a personal feel for the information
being imparted. Another aspect of this system will be the ability for the student to
learn at their own pace, choosing when to repeat information, when to continue to the
next section, and a "Pick and Choose" methodology whereby the student can choose to
hear only information appropriate to their needs (for example Geographical,
Economic, etc.). The final aspect of the system will be the optional "Test" section,
where the student is asked whether they wish to answer a series of questions based on
the information they have just been shown or heard; this also gives them the
opportunity to go back to any area they feel they do not know well enough. The
system's user interface will be laid out like a map system: the user is presented with
a map of Europe and asked to select a country, and is then shown a map of the country
they have selected. At every point in the program the user will have the option of
returning up a level, and will also be given a "Quit" word that exits the program at
any point. The system should also include a "Pause" word, so that a student who needs
a break can pick up exactly where they left off, giving the student full control over
their learning process. Traditional approaches also tend to force the student to go
at a pace that is either too quick or too slow for their needs; the aim of this system
is to give the student the power to control the rate of learning.
The main aim of any learning system is to appear intelligent and reactive to the
individual, while remaining accessible to the group, in this case students. Therefore,
the underlying aim of this paper is to show that the incorporation of the discussed
modalities, and the method with which they are incorporated, can give such systems
greater advantages over similar systems developed according to the “Traditional
Approach” of simply showing text on the screen. The ultimate aim of this paper is to
reach a system that can be used not only by individuals but also by groups, or even
entire classes, where one or more speakers interact with the system while other users
give information and ask questions. The system will ultimately be able to answer direct
questions based on the domain knowledge of the system, as well as ask questions
pertinent to the domain.
Chapter 2
Background
“One of the main reasons why it is so difficult for computers to understand natural language (and indeed visual representations) is that understanding requires many sources of knowledge, including
knowledge about the context of the communication, and general ‘common sense’ knowledge shared by speaker and hearer”
(Nilsson 1998)
2.1 Introduction
In this chapter, topics covered include the background information used in the
decision to choose the EU Tap GCSE Training Program. Also covered will be a brief
introduction to the field of Intelligent Multimedia, the reasons behind the choice of a
multimodal system, the CSLU toolkit, the field of E-Learning, some related research
in the field of Intelligent Multimedia, as well as some previous E-Learning systems
that use Multimodal Interfaces.
What is being conveyed in this section is the reasoning behind the design of the
system outlined in the paper, and the background to the ideas used, as well as an
explanation of the category this paper falls into.
2.2 What is Intelligent Multimedia?
“Multimedia systems are those which integrate data from various media (e.g., paper, electronic, audio, video) as well as various modalities (e.g., text, diagrams, photographs) in order to present information
more effectively to the user” (Maybury 1993)
The field of Intelligent Multimedia concerns itself with combining multiple
modalities (vision, audio, text, gestures, etc.) so that a system can exhibit
intelligent behaviour (Winograd 1973, Waltz 1981, Srihari 1995). The field also
attempts to form methodologies and theories for consolidating these modalities in
ways that lead not only to better understanding for the users of such systems, but
also to better understanding between the user and the systems themselves. A typical
model, showing the modules needed by a system that handles both text and images, is
given below (Srihari 1995):
[Figure 1, reproduced from Srihari (1995), is a block diagram. Text and images each receive an initial interpretation (syntactic parsing and limited semantic interpretation for text; pre-processing and limited image interpretation, using no scene-specific information, for images), supported by language models, visual models, and a knowledge base. A hypothesis generator then drives further semantic interpretation of the text and further scene analysis, yielding hypotheses of the image/text contents. These are consolidated into an intermediate representation reflecting information from both language and vision, which produces textual output, pictorial output, and action in the physical world.]

Figure 1 Computational Model for integrating linguistic and pictorial information
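In outline, the flow of Figure 1 can be sketched as a pipeline of procedures. The sketch below is illustrative only: every procedure name is a hypothetical placeholder, since Srihari's model does not prescribe an implementation, and each stage would in practice be a substantial system in its own right.

```tcl
# Illustrative sketch only: each proc stands in for one stage of the model.
proc initialTextInterpretation {caption} {
    # Syntactic parsing and limited semantic interpretation of the text
    return "parsed($caption)"
}
proc initialImageInterpretation {image} {
    # Pre-processing and limited image interpretation (no scene-specific info)
    return "segmented($image)"
}
proc consolidate {textHyp imageHyp} {
    # Intermediate representation combining language and vision
    return [list $textHyp $imageHyp]
}

set rep [consolidate \
             [initialTextInterpretation "a map of Europe"] \
             [initialImageInterpretation "europe.gif"]]
puts $rep
```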
One of the main problems Intelligent Multimedia attempts to solve is the
correspondence issue, namely the combining of textual, pictorial, and other modalities
into one combined meaning. The issue is that pictorial, textual, and gestural
information can each be ambiguous, and even after combination with other modalities
the overall meaning can be lost without background knowledge bases.
Wittgenstein (1889-1951) describes this problem in terms of "family resemblances":
not resemblances between family members such as sisters and brothers, but
resemblances between the knowledge bases of individuals in similar situations or
professions. The most famous example is that of the builders. When Builder A says
"Go and fetch me a slab", Builder B knows exactly the type and colour of slab that
Builder A is asking for: the two have been working on the same building site, and the
second has seen the first working with a certain type of slab. He does not need to
ask where to get the slab either. Should a mortician come onto the building site,
however, and Builder A ask the same question, the mortician would have a different
family resemblance to the builder and would have to ask questions such as "What kind
of slab: a slab of toffee, a concrete slab, a mortician's slab, or some other form of
slab?" and "What colour and weight of slab: a round white one, a square blue one, a
brown heavy one, or a light pink one?" The mortician would also have to ask where to
get the slab from, lacking the background knowledge that the slabs are all kept in
one place. The task of Intelligent Multimedia is to translate the family resemblances
of the user and the system being developed in such a way as to allow natural and
comfortable use of the system.
2.3 Why Multimodal Systems?
“Current computing systems do not support human work effectively. They restrict human-computer interaction to one mode at a time and are designed with an assumption that use will be by individuals (rather than groups), directing (rather than interacting with) the system. To support the ways in which humans work and interact, a new paradigm for computing is required that is multimodal, rather than
unimodal, collaborative, rather than personal, and dialogue-enabled, rather than unidirectional.” (MacEachren et al. 2004)
The above quote describes the problem that this paper also attempts to address;
whereas MacEachren et al. concern themselves with collaborative geoinformation
access, this paper concerns itself with E-Learning, but the problems outlined are
similar in both.
Traditional systems for E-Learning concerned themselves with a design for an
individual, rather than a group, and the use of single modalities for the information.
In classrooms, however, it has become accepted that simply giving a student a
textbook, or simply reading the information out, will not teach the student in the
best way; in fact, several modalities have been introduced into classrooms over the
last fifty or so years. Why, then, should computer systems use only one or two
modalities? This is one of the main ideas behind this system: several modalities can
be used to better convey the information to the student. Also, classrooms rarely
consist of a single student and a lecturer; they often have several students all
talking and interacting with the lecturer (the group's personal experience). This
system therefore attempts to allow several students to interact with the system, and
encourages students to talk with one another about the subject area, since often the
best person to explain information to a student is another student. For this reason, at
key points within the program there are “Discussion Points”, where a student or
teacher can turn to another student and get their take on the information being
presented.
A multimodal system gives students a range of inputs, and since human learning and
memory work through association, such systems should facilitate learning far more
successfully than a system that relies on a single modality, e.g. text or audio alone
(Sowa 1991).
2.4 The CSLU Toolkit
“The CSLU Toolkit was created to provide the basic framework and tools for people to build, investigate and use interactive language systems. These systems incorporate leading-edge speech
recognition, natural language understanding, speech synthesis and facial animation technologies. The toolkit provides a comprehensive, powerful and flexible environment for building interactive language
systems that use these technologies and for conducting research to improve them”(http://cslu.cse.ogi.edu/toolkit/docs 2006)
Figure 2 The CSLU Toolkit Overview
The CSLU Toolkit is an open source project undertaken by the Oregon Graduate
Institute of Science and Technology. It encompasses several technologies in a single
package that allows the rapid development of applications capable of voice
recognition, voice generation, and multimodal input and output; the package also
includes an open source, easy-to-use programming language (TCL). The package enables
the user to create and use fully working applications built within it, and enables
fast prototyping of such applications at a presentation level. In the following
sections the main components of the toolkit are described, with brief explanations of
their use in this project.
The CSLU Toolkit also supports a RAD (Rapid Application Development) style of
prototyping, and even full application creation, by placing icons representing the
different modules on a canvas, like so:
Figure 3 Example of the CSLU RAD interface
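Although the RAD canvas is graphical, the dialogue flow it expresses can be sketched in plain TCL. The sketch below is illustrative only: the state names, prompts, and console input are hypothetical and not part of the toolkit's API, and in RAD each state would be an icon wired on the canvas rather than a switch branch.

```tcl
# A rough sketch of an EU Tap-style dialogue loop.
set state start
while {$state ne "quit"} {
    switch -- $state {
        start {
            puts "Welcome to EU Tap. Name a country, or say 'quit'."
            set input [string tolower [gets stdin]]
            # Treat end-of-input as quitting, so the loop always terminates.
            if {$input eq "quit" || $input eq ""} {
                set state quit
            } else {
                set state country
            }
        }
        country {
            puts "Showing the map and facts for: $input"
            set state start
        }
    }
}
puts "Goodbye."
```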
2.4.1 Tool Command Language (TCL)
“TCL (Tool Command Language) is a very powerful but easy to learn dynamic programming language, suitable for a very wide range of uses, including web and desktop applications, networking,
administration, testing and many more. Open source and business-friendly, TCL is a mature yet evolving language that is truly cross platform, easily deployed and highly extensible“
(http://www.tcl.tk/ 2006)
TCL is the command-line programming language used in the CSLU Toolkit. It is an open
source language similar in design to "Pop-11" (partially designed by Aaron Sloman of
the University of Birmingham's AI department, 1996) and offers much of the same
functionality, although not as wide-ranging. As the main command-line language of the
CSLU Toolkit, it interfaces with the toolkit directly, allowing far more flexibility
in the design of software built on it. The TCL and associated TK languages are
designed for wide-ranging applications such as web and desktop applications, network
programming, general-purpose programming, and system administration. The TCL language
is particularly suited to RAD development, since it is very user friendly and similar
to human language in structure (an example would be an if-else statement that reads
as just that).

TCL has been used by both researchers and industry to aid the rapid development of
applications; examples range from AOL's Digital City to Cisco Systems, and of course
the CSLU Toolkit itself.
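As a flavour of the near-English syntax mentioned above, a short TCL fragment might look like this; the country and capital data are hypothetical examples, not taken from the EU Tap knowledge base.

```tcl
# Hypothetical example data stored in a TCL array.
set capital(France) "Paris"
set capital(Spain)  "Madrid"

# A conditional that reads much as it is spoken.
proc describe {country} {
    global capital
    if {[info exists capital($country)]} {
        return "$country's capital is $capital($country)"
    } else {
        return "$country is not yet in the knowledge base"
    }
}

puts [describe France]   ;# France's capital is Paris
puts [describe Malta]    ;# Malta is not yet in the knowledge base
```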
2.4.2 Speech Recognition, Generation, and Facial Animation
“Festival offers a general framework for building speech synthesis systems as well as including examples of various modules. As a whole it offers full text to speech through a number APIs: from shell
level, though a Scheme command interpreter, as a C++ library, and an Emacs interface. Festival is multi-lingual, we have developed voices in many languages including English (UK and US), Spanish
and Welsh, though English is the most advanced.”(http://www.cstr.ed.ac.uk/projects/festival/manual/festival_1.html#SEC1 1999)
Speech recognition and generation are performed by the Festival and Baldi
subprograms. These programs are designed to run in the background and interact with
the environment unseen, enabling the use of these capabilities without having to
integrate the different tools needed for such a task. Speech generation is an
essential part of this paper's program and interface, since the students will be able
to see and hear the information being read out by the program. This makes the
interaction between the user and the system as natural as possible.
Another area we felt was important was giving the computer a human face for the
student to interact with; the CSLU Toolkit provides this through the Baldi system
discussed above. This enables the text not only to be read out with a face present,
but also lets the user see the face speak the lines, with the mouth moving
appropriately with each word; the face also allows emotion to be conveyed as part of
the text.
2.4.3 Rapid Application Development Environment
The environment provides a dialogue management system, a graphical user interface,
and a canvas-based authoring system, with robust parsing and natural-language
understanding integrated. It also incorporates Language Application Wizards for the
profoundly deaf.
Base Objects

Basic objects that ship as part of the CSLU Toolkit can all be used in telephony
applications; they are listed in Figure 4, and the Tucker-Maxon objects in Figure 5.

Action, Alpha-Digit, Conditional, Digit, DTMF, Enter, Exit, Generic, Goodbye, Keyword, Start, Subdialogue

Figure 4 taken from http://www.cslu.ogi.edu/toolkit
Tucker-Maxon Objects

These objects are part of the Tucker-Maxon plugin. They were developed for use in the
classroom, and enable some multimedia applications.

Generic, List, Listbuilder, Login, Media, Randomizer

Figure 5 taken from http://www.cslu.ogi.edu/toolkit
2.5 E-Learning
“E-Learning is the use of Internet and digital technologies to create experiences that educate our fellow human beings.”
(Horton, 2001)
E-Learning uses HCI concepts to design the user interface; it uses multimedia
techniques to display information; and it employs a high level of interaction between
the user and the software. The software can be downloaded from the Internet and run
on any home system, and can be accessed and used by any student anywhere in the
world. It is a development that will be enhanced in years to come, allowing students
participating in various courses to avail of e-learning software.
The fundamental benefits of e-learning were displayed when Stanford University and
IBM joined forces in the late 1950s. Mainframe computers were used to distribute
exercises. Although this did not prove to be an entirely successful experiment, it
did show that there was a niche for computer-aided learning. According to
(Miltiadis et al., 2005), “In order to be accurate, let us mention that the application of
e-learning began in the early 1960’s, when psychologists and pedagogues noticed the
educational potentials of computers while developers discovered the possibilities of
their application.”
The real breakthrough came when PLATO (Programmed Logic for Automated
Teaching Operations) systems were introduced and used in the 1960’s; in fact they are
still used today. PLATO systems required a mainframe and other networked
terminals, which was expensive. PLATO then progressed to PC-based systems which
added further expense to this e-learning technology. This is an example of software
struggling to keep up with hardware technology. According to the PLATO website1
“The authoring language of the PLATO mainframe system was TUTOR, written by
Paul Tenczar in 1967. TUTOR was designed to have the power of a general-purpose
language such as BASIC or PASCAL, but it had additional capabilities to support
instructionally important features such as interactive vector graphics and real-time
online transaction processing capabilities such as free-response answer analysis and
feedback.”
The concept of e-learning is by no means a revolutionary idea; it has been around
since the late 1950s. The need for interactivity was realised and incorporated into
the TUTOR system in 1967. The key development, though, was the Internet, which
allowed advanced graphics and a means by which training could be accessed globally.
The creation of CDs meant that training could be stored on a single disc, eliminating
the problem of having to store software on countless floppy disks. HTML (Hypertext
Mark-up Language) provided a globally accepted format allowing text and graphics to
be viewed by the world.
2.5.1 Benefits of E-Learning
E-learning has various benefits: it can free up the teacher's time to aid
special-needs students, and it can give the teacher immediate insight into students'
progress through their marks in e-learning test simulations. It allows a student to
log on to a computer anywhere in the world and access the e-learning software. There
are further benefits besides. Re-usable components can be developed when creating
e-learning software: teachers may find that certain aspects of courses overlap when
teaching various courses, and existing e-learning software can be re-used when
teaching those specific lessons.
More capable students can move on to the following week's lessons; they do not have
to wait for other students in their class to reach the same level before moving on.
Slower students, on the other hand, can take their time and work through the
e-learning system at their own pace. The system encourages students to spend extra
time researching the given topic, for example by participating in chat rooms and by
listing web addresses they can visit. Students will be more inclined to click a
button to read more information than to walk to a library and read a book on the
topic.

1 http://wwwplato.comcommunityroadmap200301research2.html
Students' learning times are not restricted to the set time allocated in a classroom:
they can log on at any time and learn, without having to wait until the 'classroom
bell has rung'. Teachers can continuously assess how successful or otherwise the
e-learning package is, and the software can then be improved to get the best results.
Students also tend to find some teachers' vocal tones dull; a fully interactive
system encourages the student to remain attentive.
Students who require learning material to be structured in a certain format, e.g.
dyslexic users, can benefit greatly from e-learning packages: the specific format in
which they require information to be delivered can be accommodated when designing and
implementing the package. Lecturers, very often doomed to a life of jet-setting, can
cut down on travelling and leaving their homes and families; developing e-learning
software allows the lecturer to remain at home. It is not only educational
institutions that benefit from e-learning; companies can also avail of this learning
technique.
A company can use an e-learning package to train staff. This shortens the time taken
to train staff and ensures everyone receives the same training. Various companies
rely on a few members of staff to train new employees, and very often new trainees
receive different training depending on which training day they attended and who ran
it. A software package that trains new members of staff would ensure they all receive
the same training, and would allow a company to easily expand the e-learning software
to cater for new training needs and developments.
Companies continuously re-structure to improve profits, remain competitive and to
possibly produce and sell new products. The whole company is affected by company
change and new training must be given to all members of staff. This could mean team
leaders having to travel to attend new training sessions. The creation of an e-learning
package would save companies from paying for the ‘all expenses paid trips’ to
various locations. Training staff using an e-learning package would also allow
management to gauge their staff’s progress. They would then have a greater insight
into which members of staff would be suitable for promotion.
2.5.2 Emotion and Learning
“It appears that emotions can be powerful in encouraging and inhibiting effective learning and approaches to study, but educational research and models of learning have shed little light on the
interrelationships between emotions and learning.”(Ingleton, C. 1999)
The use of emotion adds another dimension to the user experience, in that the results
of questions can be reflected appropriately in the agent's expression. Emotion is an
effective incentive for students to learn, and the use of encouragement improves
learning capabilities, leads to happier students, and improves long-term memory
recall, since memory works through association. The authors therefore felt that this
project should include emotional incentives through the facial animation software
built into the CSLU Toolkit.
2.6 Related Papers
Below are a series of related papers on the subject of E-Learning through multimodal
interfaces. The first, by Walsh and Meade, is a speech-enabled e-learning system
designed for adult literacy; it shows the importance of speech-enabled learning, and
encouraged us to include speech in our system. The second, by Laufer and Tatai, is a
system in which chatting and games are used to help the user learn about the stock
market; it goes into more depth on the psychological basis for its design decisions,
which helped direct our decisions regarding the format of the program and also showed
the importance of fun in learning.
Walsh, P. and Meade, J. (2003), Speech Enabled E-Learning for Adult Literacy Tutoring, Cork Institute of Technology
Abstract: It is estimated in a recent OECD International Adult Literacy Survey that up to 500,000 Irish adults are functionally illiterate, that is many people have difficulty in reading and understanding everyday documents. We address this problem by allowing users to interact with speech enabled e-Learning literacy content using multimodal interfaces. We present two experimental prototypes that explore technical solutions and identify an application architecture suitable for literacy e-Learning. The implementation of an evolutionary prototype
that uses client side technology is described and feedback from this phase of the project is reported.
Laufer, L. and Tatai, G. (2004), Learn, Chat and Play – An ECA Supported Stock Markets E-Learning Curricula, Department of Psychology, ELTE University of Sciences, Department of Computer Science, University College London, Proceedings of the IASTED International Conference Web-Based Education February 16-18, 2004
Abstract: In this paper we are describing our system, VBroker. It is designed for the Hungarian Financial Authority to promote the stock exchange and financial markets in general. The system is incorporating an E-learning curriculum on the subject; an embodied conversational agent and a virtual stock exchange game. We describe the way the system integrates these three main parts together with an emphasis on the analysis of the e-learning scenario and our emotional model. This model enables the tutoring agent to provide proper emotional feedback. In addition we provide some detail about the use of humour, enabling the agent for edutainment. Experiments about the use and transmission of emotions are also shown that users are in favour of multimodal interfaces, and they apply its features so naturally that they do not even remember correctly the number of used emotion signals.
Some other examples are less directly related to this project, but still influenced the way the information was laid out and presented:
Roy, D., and Mukherjee, N. (2005), Towards Situated Speech Understanding: Visual Context Priming of Language Models, Cognitive Machines Group, The Media Laboratory, Massachusetts Institute of Technology
Abstract: Fuse is a situated spoken language understanding system that uses visual context to steer the interpretation of speech. Given a visual scene and a spoken description, the system finds the object in the scene that best fits the meaning of the description. To solve this task, Fuse performs speech recognition and visually-grounded language understanding. Rather than treat these two problems separately, knowledge of the visual semantics of language and the specific contents of the visual scene are fused into speech processing. As a result, the system anticipates various ways a person might describe any object in the scene, and uses these predictions to bias the speech recognizer towards likely sequences of words. A dynamic visual attention mechanism is used to focus processing on likely objects within the scene as spoken utterances are processed. Visual attention and language
prediction reinforce one another and converge on interpretations of incoming speech signals which are most consistent with visual context. In evaluations, the introduction of visual context into the speech recognition process results in significantly improved speech recognition and understanding accuracy. The underlying principles of this model may be applied to a wide range of speech understanding problems including mobile and assistive technologies in which contextual information can be sensed and semantically interpreted to bias processing.
Po, B. A. (2005), A Representational Basis for Human-Computer Interaction, The University of British Columbia
Abstract: Mental representations form a useful theoretical framework for understanding the integration, separation and mediation of visual perception and motor action from a computational perspective. In the study of human-computer interaction (HCI), knowledge of mental representations could be used to improve the design and evaluation of graphical user interfaces (GUIs) and interactive systems. This thesis presents a representational approach to the study of user performance and shows how the use of mental representations for perception and action complements existing information processing frameworks in HCI. Three major representational theories are highlighted as evidence supporting this approach: (1) the phenomenon of stimulus-response compatibility is examined in relation to directional cursor cues for GUI interaction with mice, pointers, and pens; (2) the functional specialization of the upper and lower visual fields is explored with respect to mouse and touch screen item selection; (3) the two-visual systems hypothesis is studied in the context of distal pointing and visual feedback for large-screen interaction. User interface design guidelines based on each of these representational themes are provided and the broader implications of a representational approach to HCI are discussed with reference to the design and evaluation of interfaces for time- and safety-critical systems, interaction with computer graphics, information visualization, and computer-supported cooperative work.
Chapter 3
Analysis
3.1 Introduction
This chapter covers the analysis of similar systems, the system requirements (both functional and non-functional), the user requirements (again both functional and non-functional), and finally the software and hardware requirements of the system. In other words, it concerns the analysis of the system and the problem at hand: what the issues of this system are, what the users require, what the system requires, and how all of this should be put together in practical terms.
3.2 Analysis of similar systems
Systems designed using the CSLU Toolkit follow a similar format: the initial design is dragged onto the canvas from the object panel, and text-to-speech and speech recognition are then added in RAD. Image maps in the media object, together with list builders or databases, are used to compose question-and-answer tests. The flow through the program can be followed once the build and run functions are applied. The toolkit is used in educational and corporate domains, in language-training centres, and in research. Its speech visualisation, editing and labelling tools make it a very powerful device in information systems; it has been used to design games at undergraduate level, such as Jeopardy and Millionaire, as well as educational information devices for students of all ages and those with disabilities (e.g. visual impairment). Traditional approaches included textbook-style learning from the computer screen, e.g. Microsoft's Encarta Encyclopaedia.
3.3 System Requirements
System Requirements describe the functionality of the system in a technical way. These requirements are written so that the programmer can begin coding the system.
3.3.1 Non-Functional
Non-Functional requirements are not directly concerned with the specific system
functions but add a great deal of importance to the system performance.
Accessibility – Remote access.
Interoperability – on various platforms.
Durability – must provide sustainable hardware lifetime.
Reusability – reuse and sharing of resources.
Cost effectiveness – major marketing factor.
3.3.2 Functional
Functional requirements capture the intended behaviour of the system and should detail, in technical terms, how the system will react to particular inputs; they are intended for the program developer.
By using this table format the programmer can easily implement all aspects of the
requirements in a stepwise approach.
SR-0001 System access for user
Description: This Use Case describes how the system will enable the user access to the system via password.
Initiator/Actor: Student, General User, Administrator
Pre-condition: None
Basic Path:
1. ID is entered by user.
2. System verifies the ID entered is a valid ID and relates this ID to the type of user accessing the system.
3. System displays appropriate home page and welcomes user.
Alternative Path(s):
1. User enters incorrect ID number.
2. The system establishes there is no match for the ID entered.
3. The system displays an alert.
SR-0002 System user takes test
Description: This Use Case describes how the system will set up the test.
Initiator/Actor: Student, General User, Administrator
Pre-condition: The system must previously carry out SR-0001 (password access) for each user.
Basic Path:
1. System selects chosen subject area.
2. System checks answers.
3. System responds to correct or incorrect input.
Alternative Path(s):
1. If no test is taken,
2. the system offers the option to quit, else
3. the system offers the option to return to the information site.
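The two system-level use cases above amount to an access-then-test flow. As a rough illustration only (the actual system is built from CSLU RAD dialogue objects, and every identifier below is invented for this sketch), the flow could look like this:

```python
# Hypothetical sketch of SR-0001 (system access) and SR-0002 (take test).
# The real EU Tap system is implemented in the CSLU RAD environment; this
# only illustrates the control flow described in the use-case tables.

VALID_IDS = {"s100": "Student", "g200": "General User", "a300": "Administrator"}

def system_access(user_id):
    """SR-0001: verify the entered ID and return the user type, or None."""
    user_type = VALID_IDS.get(user_id)
    if user_type is None:
        return None      # alternative path: no match, display an alert
    return user_type     # basic path: display home page, welcome user

def take_test(user_type, answers, correct):
    """SR-0002: check answers; access (SR-0001) is a pre-condition."""
    if user_type is None:
        raise PermissionError("SR-0001 must succeed before a test is taken")
    return sum(1 for a, c in zip(answers, correct) if a == c)

role = system_access("s100")
print(role)                                      # Student
print(take_test(role, ["a", "b"], ["a", "c"]))   # one correct answer: 1
```

The pre-condition in SR-0002 is modelled by refusing to run the test when access has not been granted.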
The functional and non-functional requirements for the system were established in the
requirements elicitation and analysis stages, when the team decided on an educational
program (European Union EU-Tap) for development. Requirements were continually
refined throughout the process.
3.4 User Requirements
The intended users of this system are GCSE students, based either in the classroom or studying from home (for example, long-term illness sufferers), as well as visually impaired users; the system is also available for general use. The unique property of this system is its multimodal input/output design.
3.4.1 Non-Functional
Performance - Fast responses.
Usability - Ease of use.
Reliability - Need to have confidence in system.
Survivability - Perform under adverse conditions.
Expandability - Needs to be future proof.
Interoperability – should work with other systems.
Consistency – system must follow HCI guidelines.
Security – option for password protection.
Safety – should be safe to use.
Availability – should be available for use at all times.
Maintainability – should be easily maintained.
3.4.2 Functional
GCSE Student
Provide multimodal E-Learning System.
Access educational information on European Union (EU Tap).
Access information on specific countries by point and click input.
Can be modified to add additional subject material as required.
Use voice input as an alternative to key-strokes, including for disabled and visually impaired users.
Output from system via text, audio and pictorial reference.
Provide optional testing.
Record score.
Work at preferred pace with repetition an option.
General User
All of the above functions for general interest.
The tables below describe some examples of the functional requirements in natural language that is easily understood by the user. This tabular format lets the user and the development team easily verify that the desired functionality is included.
UR-0001 System access for user
Description: This Use Case describes how the user enters the system via password.
Initiator/Actor: Student, General User
Pre-condition: None
Basic Path:
1. Select appropriate link from the Welcome page.
2. Login Screen is displayed.
3. Click on stored User Name, or enter User Name and ID.
4. Select Login.
Alternative Path(s):
1. Invalid entry.
2. An alert will appear prompting for a revised entry.
UR-0002 System user takes test
Description: This Use Case describes how the user takes a test.
Initiator/Actor: Student, General User
Pre-condition: User must first complete UR-0001 (system access via password).
Basic Path:
1. Select subject area.
2. Answer questions.
3. Get result.
Alternative Path(s):
1. User decides not to complete the test.
2. The user leaves the system.
3. The user returns to the general information site.
Domain Requirements
Provide a natural dialogue for the user.
Consistent approach.
Consistent naming convention.
Non-redundant data.
User should not have to re-enter previously accepted information.
User support.
3.5 Hardware Requirements
In order to make the system available to as wide an audience as possible while still ensuring that it runs well, the following should be taken as the minimum requirement. A better system is obviously desirable, but due to the financial constraints of schools the minimum requirement is set fairly low (in comparison to present-day systems):
64 MB RAM
Video card with OpenGL support (a GeForce 5500 with 64 MB is a typical option)
200 MHz processor
16-bit sound card with audio input and output (a Creative SoundBlaster is a typical option)
CRT monitor
Keyboard and mouse
Also required for the system would be a headset with high quality noise reduction on
the Microphone, and an internet connection for the initial set up of the CSLU toolkit.
As well as the system's hardware requirements, it is also necessary to describe the environment required for the system to operate. A reasonably large classroom is advisable, big enough to accommodate the class with one student per system and a main system for the lecturer. The room must be shielded from external sources of sound, both so as not to affect the voice recognition and so as not to distract the students. Also suggested is comfortable lighting that is not too direct (to avoid reflections on the screens) and not so bright as to dazzle partially sighted users, together with fully adjustable chairs to ensure the students are as comfortable and as ergonomically situated as possible.
3.6 Software Requirements
The typical software requirements needed for the system are as follows:
Windows NT, Windows XP, or Windows 95/98/2000 (any of
these operating systems should work well)
CSLU Toolkit Rapid Application Developer 2.0.0 (essential
for the running of the program)
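A deployment could verify the minimum hardware figures mechanically before installation. The sketch below is purely illustrative (the machine profile values are hypothetical, and a real check would query the operating system rather than a hand-written dictionary):

```python
# Illustrative check of a machine profile against the stated minimums.
# The profile is hypothetical; a real check would query the OS.
MINIMUM = {"ram_mb": 64, "cpu_mhz": 200, "sound_bits": 16}

def failed_requirements(profile, minimum=MINIMUM):
    """Return the list of requirement keys the profile does not meet."""
    return [key for key, needed in minimum.items()
            if profile.get(key, 0) < needed]

school_pc = {"ram_mb": 128, "cpu_mhz": 350, "sound_bits": 16}
print(failed_requirements(school_pc))   # [] -> machine is adequate
```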
Chapter 4
Design
4.1 Introduction
In this section the actual design of the system is broken down and explained from both a system-architecture standpoint and an HCI standpoint. The Human Computer Interaction guidelines that were consulted in the course of this project are also discussed, explaining the implications of each. This section also explains UML (Unified Modelling Language) and how it was used to aid the design of the system from start to finish.
DEFINING THE PROJECT: European Union (EU Tap)

Temporary
What is your project's beginning? Start Date: 02-02-2006; Week 1, Semester 2.
What is your project's end? End Date: 04-05-2006; Week 12, Semester 2.
Who will you temporarily need for your project team? Eli Blyth, Rosaleen Hegarty, Sonya Conlon.

Unique
What's unique about your project? Provide a multimodal E-Learning system with a test feature, created using the CSLU toolkit for the provision of educational information. Target audience: GCSE students, with voice input/output for disabled users. High availability of domain knowledge. Rapid responses. Access and update of shared resources with integrity. Support/Help.
What have others accomplished that is similar to your project? Traditional approaches included textbook-style learning from the computer screen, e.g. Microsoft's Encarta Encyclopaedia.
In what manner are these projects the same as your project? The focus is on educational information delivered by electronic device.

Creation
What aspect(s) of your project is the creation of something new? A unique user learning experience through multimodal input/output. The user has control of learning pace, with repetition of sessions where necessary. Testing to validate knowledge acquisition is an option.
In what technologies will your team converse? (e.g., CAD/CAM, Java, FoxPro, etc.) CSLU Toolkit, C, Excel, UML (Rational Rose), client/server three-tier/multi-tier architecture, Internet.

Product
What is your project goal? To provide an up-to-date and commercially viable product that is easy to use, easily adapted, interoperable with current hardware/software and maintainable, for E-Learning in both the classroom setting and for long-term illness sufferers and disabled students studying from home.
Is this goal attainable? Yes. The requirements are fully understood by the software development team to deliver a validated product on time.
Does the goal need a reality check? If so, how will you accomplish this? Many other projects are currently underway. Should the goal need a reality check, a review with the client and a presentation of the current prototype should be acceptable.
4.2 Application Architecture
The CSLU Toolkit was chosen for its ease of use and speed. Its layered architecture (a graphic representation of the proposed system architecture is available from the toolkit site2) is as follows:

GUI Level (Toolkit): Tutors, RAD, SpeechView, BaldiSync.
Script Level (Tcl, cslush code): high-level recognition, TTS, NLP; file processing, result evaluation.
Package Level (C language, interface code): interface between Tcl and low-level C code; animated faces.
C Level (C language, csluc code): low-level code for speech recognition, wave I/O, features, etc.
Festival (Scheme): speech synthesis.

2 http://www.cslu.ogi.edu/toolkit
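Each level in this stack delegates to the one beneath it: RAD calls Tcl scripts, which call C interface packages, which call the low-level C recogniser. The sketch below illustrates that delegation chain only; none of these functions are real toolkit APIs, and the "recognition" is a stub:

```python
# Hypothetical sketch of the CSLU Toolkit layering: each level is a thin
# wrapper over the one below. All names and behaviour are invented.

def c_level_recognize(audio):
    """C Level: low-level recogniser working on raw audio (stubbed)."""
    return "dublin" if "dub" in audio else "unknown"

def package_level_recognize(audio):
    """Package Level: C interface code exposed upward to Tcl (stubbed)."""
    return c_level_recognize(audio)

def script_level_recognize(audio):
    """Script Level (Tcl): high-level recognition and result evaluation."""
    word = package_level_recognize(audio)
    return {"word": word, "accepted": word != "unknown"}

def gui_level(audio):
    """GUI Level (RAD): presents the result to the dialogue designer."""
    result = script_level_recognize(audio)
    return "heard: " + result["word"] if result["accepted"] else "please repeat"

print(gui_level("dub-audio-frames"))   # heard: dublin
```

The point of the layering is that the dialogue designer works only at the GUI level, while speed-critical work stays in C.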
4.3 Human Computer Interaction (HCI) guidelines
“Human-computer interaction is a discipline concerned with the design, evaluation and implementation
of interactive computing systems for human use and with the study of major phenomena surrounding
them.”
(Hewett et al 2004)
Although this definition is fairly accurate, it does not represent the entirety of the field, since there is no internationally agreed definition; several unofficial versions are used by different bodies. Whichever definition is used, the main principle is standard throughout: the field is concerned with improving the interaction between user and computer.
4.3.1 Consistency
Expectations that the user has built up while using one part of the system should not be contradicted by another part of the same system; consistency promotes confidence. For example, people have become accustomed to confirming a command by pressing Enter or Return, and deviating from this in any system would cause confusion. A good system guides the user through each task, as EU Tap does: information, questions and answers in all topic areas follow a fixed format for ease of use.
4.3.2 Compatibility with User Expectations
Interactions between the end user and the system are achieved through interactive
dialogues, and point and click gestures. Users will come to expect a standard format,
requiring easy navigation, and useful interfaces instilling a command and control of
their system, promoting confidence and a useful lifetime for the system.
4.3.3 Flexibility and Control
Dialogue flexibility and control refers to how well the system can cater for or tolerate
different levels of user familiarity and performance and should allow both speech and
gesture input, whilst guiding the user in these areas.
4.3.4 Error Prevention and Correction
This guideline concerns the supportiveness of the dialogue and the amount of assistance provided. Error messages, system prompts and confirmation of what the system is doing fall within this category, and should help the user correct misunderstood input and supply missing information.
4.3.5 Continuous and Informative Feedback
Informative feedback is consistent with instilling confidence in the system, and will
be provided through speech output by the system to help the user navigate through the
lesson.
The test will inform the user of any wrong answers they have made and provide them
with the option to continue or return to the information site for extra tutorials.
4.3.6 Visual Clarity
Colour on interactive screens will be clear and consistent. Interface options will be
placed in obvious positions and follow a continuous flow, instilling confidence in the
user, also reducing the amount of time and effort the user has to put in to navigate
through the system. Screens displayed will be easy to read.
4.3.7 Relevance of Information
Superfluous information will only lead to confusion; messages will be simple and
clear.
4.4 Unified Modelling Language
UML is the tool we adopted from the outset to help understand the problem domain, communicate it fully to team members, support reviews, and from there evolve a working demo of the system.
The use case view of the system encompasses the use cases that describe the
behaviour of the system as seen by its end users, analysts and testers. This view
doesn’t really specify the organisation of the software system. Rather it exists to
specify the forces that shape the systems architecture. With the UML, the static
aspects of the view are captured in use case diagrams.
[Booch, G; Rumbaugh, J; Jacobson, I (1999) The Unified Modeling Language User
Guide, Addison-Wesley Harlow, England.p31-32].
These diagrams contain the following elements:
Actors, which represent users of a system, including human users and other systems.
Use Cases, which represent functionality or services provided by a system to users.
[Si Alhir, S. (1998) UML in a Nutshell: A Desktop Quick Reference, O'Reilly,
Cambridge, p71]
Generalization is exemplified in the case of Student, with Student being the parent
and Classroom and Home (student) being the children. The child shares the structure
and behaviour of the parent with extra specialisations, reducing repetition and
incorporating object orientation.
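The Student generalization maps naturally onto inheritance. A hypothetical sketch (class and attribute names invented for illustration, not taken from the system):

```python
# Hypothetical sketch of the Student generalization: Classroom and Home
# students inherit the parent's structure and behaviour and specialise it.

class Student:
    def __init__(self, name):
        self.name = name

    def enter_password(self, password):
        return password == "eu-tap"      # shared behaviour from the parent

class ClassroomStudent(Student):
    location = "classroom"               # specialisation

class HomeStudent(Student):
    location = "home"                    # e.g. long-term illness sufferers

s = HomeStudent("Aoife")
print(isinstance(s, Student), s.location)   # True home
```

Because both children share the parent's login behaviour, it is written once, which is the repetition-reducing effect described above.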
We build models to visualize and control the system’s architecture.
(Booch, G; Rumbaugh, J; Jacobson, I (1999) p4).
The dynamic view of Take Test is shown using a state diagram, in which each state transitions into other states.
[Figure: high-level use case diagram for the actual system, European Union EU Tap. System users: Student (access system via password; access EU information; take test), General User (access system via password; access EU information), Administrator (test purposes).]
Figure 6 High Level Use Case Diagram
Actors and Specification:
Student: Students are those studying for GCSE exams, either in the classroom setting or based at home. Those based at home may be suffering long-term illness and wish to continue with their studies. The multimodal feature is beneficial to partially sighted users. The user can acquire information on European studies and then test their knowledge. All knowledge is current and up to date.
General User: Any other user, from the very young to the very wise, may explore the system.
Administrator: The administrator's role is testing.
<<actor>> Back-up system: database, knowledge base, print-outs.
[Figure: use case diagram for the Student actor, who is generalized into Classroom and Home students. Use cases: enter via password; choose topic to study; validate information (<<uses>> Database and Knowledge Base); repeat topic or choose new topic; take test; quit lesson/test.]
Figure 7 Use Case Diagram for Student
Actors and Specification:
Student: Student is a generalization of Classroom student and Home-based student. The student accesses the current European information by entering a password, chooses a topic of study, and can repeat the topic or take a test, then quits the system.
Classroom Student: Within the context of traditional schooling, the student can use the multimodal system to acquire information (beneficial to partially sighted users).
Home-based Student: The system allows a user to study from home at their own pace, for individual reasons.
Use Cases:
Enter password: All users enter the system via password.
Choose topic: Enter a topic area to study.
Repeat: Repeat or choose a new topic.
Test: Take a test and record the score.
Quit Lesson: Decide to end the session.
Take Test Scenario:
This state diagram of a sub-level of the system illustrates what happens when Take Test has been selected. On opening, the initial state is Ask question. When the question is answered, the object changes state depending on whether the response is correct. When the response is correct, the counter begins and the next question is asked; this continues until the end of the test, provided all answers are correct. Should an incorrect answer be given, however, the object state changes to Wrong answer and the user is offered the opportunity to return to the main program for EU information or to complete the test.
[Figure: state diagram. Start state → Ask question; a correct response leads to Correct answer and the next question; a fail response leads to Wrong answer and then either Back to Info site or on through the test; → End state.]
Figure 8 State Diagram for Take Test
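The Take Test scenario can be sketched as a small state machine. The sketch below is a hypothetical illustration of the flow only (in the real system the user chooses whether to return to the information site; this sketch always returns):

```python
# Hypothetical state-machine sketch of the Take Test scenario.
def run_test(questions, answer_fn):
    """Ask each (question, correct) pair; a wrong answer ends the test
    and returns the user to the information site in this sketch."""
    score = 0
    for question, correct in questions:
        if answer_fn(question) == correct:
            score += 1               # correct response: counter advances
        else:
            return {"state": "back_to_info", "score": score}
    return {"state": "end", "score": score}

qs = [("Capital of Ireland?", "Dublin"), ("Capital of France?", "Paris")]
print(run_test(qs, lambda q: "Dublin" if "Ireland" in q else "Paris"))
```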
Chapter 5
Implementation
5.1 Introduction
Using the requirements established in the design stage, implementation of European Union EU Tap began by dragging and dropping objects onto the canvas to create a flow of events combining an educational tool and a multimodal learning system. Throughout implementation the system was continually tested, ensuring that the user's requirements were adhered to and fully understood.
5.2 European Union EU Tap Demo
The system opens with an image map displaying a map of the European Union; when the mouse rolls over a country, the area is highlighted and can be clicked to acquire information on that place. The system asks which country you would like knowledge on, and from there details the information. The test option was implemented using the List Builder and Database objects. Sub-dialogues are used frequently as they keep the main screen neat and simple; within the sub-dialogues are nested dialogues pertaining to the parent subject. The repair default has been adjusted to allow more attempts than the default three, and when this limit is exhausted the system offers further options rather than shutting down. The option to quit the system is available at every level, giving the user greater control and flexibility over their study period.
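The adjusted repair behaviour can be sketched as a retry loop. This is a hypothetical illustration only; the real behaviour lives in RAD's repair settings, not in code like this:

```python
# Hypothetical sketch of the adjusted repair loop: after the attempt limit
# is reached the system offers further options instead of shutting down.
def repair_loop(recognise_fn, utterances, max_attempts=5):
    for attempt, utterance in enumerate(utterances[:max_attempts], start=1):
        result = recognise_fn(utterance)
        if result is not None:
            return {"outcome": "recognised", "value": result,
                    "attempts": attempt}
    return {"outcome": "offer_options",
            "options": ["return to menu", "quit"]}

noisy = [None, None, None, "ireland"]   # recognised on the fourth attempt
print(repair_loop(lambda u: u, noisy))
```

With the default limit of three this input would have failed; raising the limit lets the fourth attempt succeed, which is the motivation for the adjustment.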
5.3 Implementation Stage
The first area to be implemented was the splash screen. The opening screen has an attractive layout; although simple, it encourages the user to want to use the program. Figure 9 displays an image of the implemented splash screen.
Figure 9 displays splash screen for E U Tap
Figure 10, below, displays the master program. This screen was implemented to control the flow of the sub-screens. It is consistent in its appearance, providing predictable and useful navigation.
Figure 10 displays an image of the master screen.
The following diagram depicts an image of both students and the administrator logging on. This implementation took into consideration that users may enter the incorrect username and password. It was implemented to accommodate valid and invalid entry.
Figure 11 shows users logging in screen
Figure 12, below, shows the screen that provides the user with the choice to learn information on a particular country. This implementation stage also imports image maps for selected countries.
Figure 12 shows the choices offered to the student user
Figures 13 and 14 below display two examples of the image maps that are imported while the user is using the system. Figure 13 shows the image imported if the user selects Ireland; Figure 14 shows the image map imported if the user selects Holland.
Figure 13 shows imported image map of Ireland
Figure 14 shows imported image map of Holland
Figure 15 show images that allow users to select a subject area
This screen consists of a set of images from which the user can select a subject area using point and click. However, the user also has the choice of selecting a subject area orally. Figure 15 depicts the images used to allow users to select a subject area.
The following diagram, 16, shows the implementation stage of questions and answers. The example that has been illustrated below shows the options given to the user after having chosen the U.K. as the main subject area they wish to be tested on.
Figure 16 illustrates the screen image of options that have been given to the user after selecting the U.K. as their choice of country
Figure 17 below shows how the system was implemented to allow users the choice of
learning information about a country or taking a test.
Figure 17 illustrates two choices given to users, take a test or learn information about a country.
Figure 18, illustrates the implementation stage that occurred after designing a screen
asking users to input their feelings about the system
Figure 18 shows how the system was implemented to ask the user their feelings about the system
Chapter 6
Testing
6.1 Introduction
In this section, possible forms of testing are discussed, along with the testing format decided upon; the forms used in the testing phase of the project are broken down and explained. Also discussed here are the debugging information used in the development of the system, the conclusions drawn from the results of the testing, and any major obstacles that were, or were not, overcome in the development of this project.
One of the main reasons a system is tested is to ensure that it does exactly what it was designed to do, and that in the process it has no unwanted effects on itself or on the computer it was designed to run on.
6.2 White, Black and Grey Box Testing
The following sections describe the three main approaches to testing, and the reasons
behind the group’s choice to use Grey Box Testing.
6.2.1 White Box Testing
“White box testers have access to the code, but even a black box tester can know the branches of code--the rules within the code that cause operations to fork. A white box tester generally uses the
code, and the ability to create drivers and stubs to test the code directly. They do not rely on the UI to do it”
(http://www.sqatester.com/methodology/WhatisGrayBoxTesting.htm 2000)
Essentially, white box testing is a method of testing whereby the tester understands
the entire running of the system and so knows the routes that information takes. This
form of testing is labour intensive, since it must examine every single path through the system in detail. It is good for individual components but less applicable to entire systems, especially when there are multiple paths, since each path has to be tested and understood by the tester in order to test the system fully.
This form of testing is usually employed by the programmer of the components of the
system, since they know the component inside and out, and so a tester doesn’t need to
be trained in the entire functioning of the system. White box testing has also been
referred to as Glass Box Testing or Structural Testing by some texts, due to the fact
that the structure of the program is being tested as well as the behaviour.
6.2.2 Black Box Testing
“True black box testers look only at the GUI and can not touch intermediate files, registry entries,
databases, etc., nor are they permitted to see the results their actions have wrought, other than through
the UI. They are, therefore, only permitted to use the UI to do their testing”
(http://www.sqatester.com/methodology/WhatisGrayBoxTesting.htm 2000)
Black box testing is the complete opposite of white box testing, in that the tester does not see any of the underlying system; the tester simply checks that the outputs correspond to the inputs. How this is achieved is not important; all that is tested is that the system works from a user's perspective. This form of testing looks at the overall functionality of the system and does not concern itself with the paths through it. It is also referred to in some texts as Behavioural Testing, since the behaviour of the system is being tested as opposed to its structure.
6.2.3 Grey Box Testing
“The typical grey box tester is permitted to set up his testing environment, like seeding a database, and can view the state of the product after their actions, like performing a SQL query on the database to be certain of the values of columns. It is used almost exclusively of client-server testers or others who use
a database as a repository of information, but can also apply to a tester who has to manipulate XML files (DTD or an actual XML file) or configuration files directly”
(http://www.sqatester.com/methodology/WhatisGrayBoxTesting.htm 2000)
Grey box testing is a combination of white and black box testing, in that a basic understanding of the system is needed before testing can commence. The tester proceeds to test the system as in black box testing, but then looks more closely at components should errors or bugs arise, whereas in black box testing the tester would simply inform the programmers that there was a problem and the conditions under which it appeared. Grey box testing enables more detailed testing and results, as with white box testing, but not every path through the system is examined: only the paths that present problems are looked at closely. This gives the tester a more detailed description of the problems than black box testing, while allowing far greater coverage than purely white box testing.
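A grey-box test in this style drives the system through its public interface but also inspects internal state when checking the result. The sketch below is hypothetical (the score store and interface are stand-ins, not part of EU Tap):

```python
# Hypothetical grey-box test: drive the system through its public entry
# point (black-box style) but also inspect internal state (white-box style).

class ScoreStore:
    """Stand-in for the system's internal score database."""
    def __init__(self):
        self.records = []

    def record(self, user, score):
        self.records.append((user, score))

def submit_score(store, user, score):
    """Public interface under test."""
    store.record(user, score)
    return "score saved"

def grey_box_test():
    store = ScoreStore()
    reply = submit_score(store, "student1", 8)
    if reply != "score saved":               # black-box: check the reply
        return "interface failure"
    if store.records != [("student1", 8)]:   # grey-box: check the store
        return "internal state failure"
    return "pass"

print(grey_box_test())   # pass
```

A pure black-box tester would stop at the reply; the grey-box tester also confirms the score actually reached the store.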
6.3 Format
The format of testing decided upon for this system is Grey Box Testing, due to the system's complexity. With the grey box method more of the system can be covered, and each member of the team gets a chance to test the system, allowing the programmers, the designers, and the requirements-tracking team to check the system against the functional and non-functional requirements.
For this project it was decided to employ a bug tracking system to help speed up development, allowing the team to address major bugs and errors first and leave trivial errors until later, resulting in a working prototype much sooner than if the team had attempted to fix errors as they were identified.
The bug tracking system was designed so that a bug or error could be classified by severity and then by the nature and location of the problem. For instance, a spelling mistake would be classified as minor or trivial, whereas the program exiting or being unable to continue would be classified as major or a blocker. It was also decided to include the present state of each bug in the tracking forms, so that the programmers did not find themselves attempting to fix errors and bugs that had already been fixed.
The design of the tracking form is shown in the Forms section below.
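The classification scheme above can be sketched as a simple record whose fields mirror the tracking form; the code itself is illustrative and was not part of the project:

```python
from dataclasses import dataclass

# Severity and state values taken from the tracking form described above.
SEVERITY = {1: "Blocker", 2: "Major", 3: "Normal", 4: "Minor", 5: "Trivial"}
STATES = ("PENDING", "NOW FIXED", "NOT FIXABLE", "IGNORING")

@dataclass
class BugReport:
    brief: str       # one-line description
    severity: int    # 1 (Blocker) .. 5 (Trivial)
    state: str       # one of STATES
    location: str
    found_by: str

    def is_blocking(self):
        # Blockers and majors are fixed first; trivial errors wait.
        return self.severity <= 2 and self.state == "PENDING"

bug = BugReport("Spelling mistake on splash screen", 5, "PENDING",
                "Splash screen text", "Sonya")
print(bug.is_blocking())  # a trivial bug does not block the prototype
```

Sorting a list of such records by `severity` reproduces the "major bugs first, trivial errors later" ordering used by the team.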
The testing format also required a test plan, one which would direct the testers and ensure that the same sections were not tested over and over again by different people, especially once a section had been shown to work as expected. This plan simply timetabled testing of each section, with a time frame in which areas that had changed would be re-tested. The team felt that this simple test plan would suffice, since the proposed system was relatively small.
6.4 Forms
The following is the design of the bug tracking form used by the team during the testing phase of the prototype's design and development (a description of each field follows its label):
Brief Description: A brief one-line description of the bug found.
Severity Level (1 being Blocker, 5 being Trivial): 1 2 3 4 5
State (the current state of the bug): PENDING / NOW FIXED / NOT FIXABLE / IGNORING
Location of Error: The detailed position of the bug or error.
Detailed Description: A detailed description of the problem, including code or layout where necessary.
Found by: Who found the bug.
Date Found: The date the bug was found.
Nature of Fix: A detailed description of the fix implemented, including the code change, location change, etc.
Fixed by: The person who fixed the bug.
Date Fixed: The date on which it was fixed.
Figure 4: Bug Tracking System (BTS) form used in this project
6.5 Test cases
“Use Cases are used to specify the required functionality of an Object-Oriented system. Test cases that are derived from use cases take advantage of the existing specification to ensure good functional test coverage of the system” (Wood et al. 1999)
The above quote shows the importance of taking the use cases into account when deciding upon the best test cases; research in this area shows that correct use of the existing use cases can simplify the process of creating test cases good enough to properly test the system being developed. For this reason, the authors decided to base the test cases upon the use cases, including situations where the user may enter incorrect information.
With test cases there are two important types: positive path and negative path.
Positive Path Test Cases – cases where the expected inputs are supplied at the expected times; the results should be in keeping with the outputs described in the use cases. If the outputs do not match what is expected from the inputs, the case fails and a bug should be raised using the form described earlier.
Negative Path Test Cases – cases where intentionally incorrect inputs are passed to the program in an attempt to disrupt or crash it. Should the program not deal with these erroneous inputs, a bug should be raised using the form described earlier. This type of test passes only when the erroneous inputs are dealt with appropriately, either by reporting an error informing the user that the input was incorrect, or by asking the user to re-enter the information.
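Using the log-on step from the test-run transcripts as an illustration (the `check_password` helper below is hypothetical; only the expected password 'x360' comes from the transcripts), a positive path case supplies the expected input, while a negative path case checks that bad input is rejected gracefully rather than crashing the program:

```python
def check_password(entry):
    """Hypothetical log-on check; the transcripts give 'x360' as the password."""
    if not isinstance(entry, str):
        # Handle the erroneous input rather than letting the program crash.
        raise TypeError("password must be text")
    return entry == "x360"

# Positive path: the expected input at the expected time.
assert check_password("x360") is True

# Negative path: incorrect inputs must be dealt with, not crash the run.
assert check_password("wrong") is False   # rejected, user can be re-asked
try:
    check_password(None)                  # deliberately malformed input
except TypeError:
    print("negative path handled")        # handled appropriately: no bug raised
```

If either assertion failed, or the malformed input escaped unhandled, a bug would be raised on the tracking form described in section 6.4.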
“Parallel to the software development effort, the software test team can take advantage of the Use Case format by deriving Test Cases and Test Scenarios from them” (Wood et al. 1999)
The following diagram (Figure 19, courtesy of Wood et al. 1999) shows how this derivation is done:
[Figure: process-flow diagram from Wood et al. (1999). Use Case Development feeds Object Model Creation, Scenario Creation, Interaction Diagrams, and Software Requirements; these in turn drive Test Case Development, Test Scenarios, Class Design, and the Design Level Object Model, leading to Component Coding & Unit Testing, Test Development, and System Integration, alongside Build Construction, Requirement Verification, and SCR Work-off.]
Figure 19: Use Cases are used to develop the static and dynamic object models.
6.6 Debugging
After using the above techniques to find any errors, the debugging process began in
order to find out where the error occurred and to isolate them. The code was then
examined. A subsequent solution was developed. The newly modified code was
inserted and the program was run again to check that the bug was fixed.
6.7 Conclusion
Testing identifies problems within the system from every user perspective. Often it is
difficult to spot one’s own mistakes and it takes a fresh eye to assess the program for
bugs. Verifying and validating the system confirms that we have created the system
the user wants and that we have built the right product.
Chapter 7
Conclusion
7.1 Introduction
In this section the results of the paper shall be presented along with the critical
analysis of the paper, the software used, the interface, and the overall system
developed as part of this paper. Also discussed here will be the future developments
that could be applied to this paper to make it a far better solution, and any
developments that we had originally thought of, but were unable to implement within
the time frame.
7.2 Results
Multimodal systems are a welcome tool to education and will be around for the future
with many additional intelligent agents added as testing further high-lights the
empowering impact these devices have on a group or individual. Studies in
experimental psychology are using these approaches to understand the brain activities
in autism and many early years’ development issues. We have achieved our goal to
design and implement a usable educational tool for geography students studying for
GCSE exams, with built in test function.
7.3 Critical Analysis
The concept of a multimodal educational system has immense potential, as earlier
discussed those with long term illness need not get left behind because of classroom
absence and even within the classroom setting this system allows individuals to work
at their own pace and to repeat any areas necessary until they feel satisfaction with
their learning acquisition. Too often in a classroom a message is relayed only once,
those who are absent or need extra time may never come to grips with that piece of
knowledge and slowly but surely slip further away from the need to be educated. This
should not be the case and by enabling computers to interact quality knowledge with
children we will be enriching their lives for all our futures.
7.4 Future Developments
Having run the implemented system, suggestions have been put forward for instance
only specified answers can be accepted by the system, if the knowledge base was
more extensive then the user would not be so restricted in their replies. A particular
user suggested the use of video attached to topics of interest so the audio is
accompanied by the visual constantly changing and maintaining interest. Users of the
system loved the roll over effect with the mouse high lighting countries of interest and
asked for more click on features like this in future systems. More development is
necessary in the emotive side of learning and exploration of this area would be hugely
interesting. There is a change of focus with multimodal educational systems from that
of ‘being taught’ to ‘learning’. Digital technologies encourage the idea of independent
learning at an individual’s pace without the pressure of peers.
References
MacEachren, A. M., Cai, G., Sharma, R., Rauschert, I., Brewer, I., Bolelli, L., Shaparenko, B., Fuhrmann, S., and Wang, H. (2005), Enabling Collaborative Geoinformation Access and Decision-Making through a Natural, Multimodal Interface, GeoVISTA Center, Department of Geography, Department of Computer Science and Engineering, and School of Information Sciences and Technology, Penn State University.
Latoschik, M. E. (2005), A User Interface Framework for Multimodal VR Interactions, AI & VR Lab, University of Bielefeld.
Aleksander, I. (1987), Designing Intelligent Systems: An Introduction, Billing and Sounds Ltd.
Jurafsky, D., and Martin, J. H. (2003), Speech and Language Processing, Prentice Hall.
Nilsson, N. J. (1998), Artificial Intelligence: A New Synthesis, Morgan Kaufmann Publishers Inc.
Wittgenstein, L. (1981), Tractatus Logico-Philosophicus, Routledge.
Wittgenstein, L., Heaton, J., and Groves, J. (1995), Wittgenstein for Beginners, Icon Books Ltd.
Wittgenstein, L., translated by G. E. M. Anscombe (1997), Philosophical Investigations, Blackwell Publishers Ltd.
Sowa, J. F. (1991), Principles of Semantic Networks: Explorations in the Representation of Knowledge, Morgan Kaufmann: Los Angeles, CA.
Ingleton, C. (1999), Emotion in Learning: A Neglected Dynamic, Advisory Centre for University Education, University of Adelaide, HERDSA Annual International Conference, Melbourne, 12-15 July 1999.
Tyler, B., and Soundarajan, N. (2003), Black-box Testing of Grey-box Behaviour, Computer and Information Science, Ohio State University, http://www.cse.ohio-state.edu/~neelam/papers/fates03.pdf
Walsh, P., and Meade, J. (2003), Speech Enabled E-Learning for Adult Literacy Tutoring, Cork Institute of Technology, Proceedings of the 3rd IEEE International Conference on Advanced Learning Technologies (ICALT'03).
Po, B. A. (2005), A Representational Basis for Human-Computer Interaction, The University of British Columbia, April 18, 2005.
Roy, D., and Mukherjee, N. (2005), Towards Situated Speech Understanding: Visual Context Priming of Language Models, Cognitive Machines Group, The Media Laboratory, Massachusetts Institute of Technology.
Wood, D., and Reis, J. (1999), Use Case Derived Test Cases, Harris Corporation, STAREAST '99, http://www.stickyminds.com/
Appendices
Bug Tracking Form for the EU Tap System
Brief Description: Database not connected
Severity Level (1 being Serious, 5 being Trivial): 1 2 3 4 5
State: PENDING / NOW FIXED / NOT FIXABLE / IGNORING
Location of Error: Main system, connection of ODBC; lower levels also affected
Detailed Description: I have been unable to connect the database to enable the selection of country-specific information in all aspects of the system, resulting in an inability to test and read information. I have a workaround in place in that a list is being used for the time being; however, this means a very complicated system structure and could lead to future errors if not rectified.
Found by: Rosaleen
Date Found: 29/03/2006
Nature of Fix: Added the appropriate dll file into the CSLU Toolkit, based on discussion with the Intelligent Multimedia demonstrators.
Fixed by: Rosaleen, Sonya, Eli, with assistance from the Intelligent Multimedia demonstrators
Date Fixed: 07/04/2006
Transcripts of Test Runs of Working System
Screen 1 shows the splash screen.
Screen 2 shows the user being asked to log on. They will be asked to input the password, 'x360'.
Screen 3 shows the user being asked to select a subject area via mouse input. They also have the option of selecting the subject area orally.
Screen 4 shows the user being asked to choose whether they would like to learn, partake in a test, or quit the program. Users also have the option of making this choice orally. Users can also select a country by rolling the mouse over the screen and selecting the country of interest.
Screen 9 shows the last page of the application, which bids the user good-bye and reiterates the word fun.
Program scripts
# queryone_60set x0 [expr -120.0 + $offsetX] set y0 [expr 510.0 + $offsetY] set obvar [newO queryone $x0 $y0 {no 6}]set r(queryone_60) $obvarupvar #0 $obvar obset ob(gif_original) {S:/CSLULAB/MG122/Toolkit/2.0/apps/rad/base/gif/generic.gif}set ob(recogType) {Tree}set ob(override:recognizer) {0}set ob(changetrigger) {5}set ob(dtmf,mode) {off}set ob(prompt,type) {tts}set ob(gif_tmmods) {S:/CSLULAB/MG122/Toolkit/2.0/apps/rad/packages/Tucker-Maxon/gif_alt/generic.gif}set ob(override:sdet) {0}set ob(override:vumeter) {0}set ob(prompt,markupText) {<SABLE></SABLE>}
set ob(recogportType,0) {Words}set ob(recogportType,1) {Words}set ob(recogportType,2) {Words}set ob(override:repair) {0}set ob(override:tts) {0}set ob(recogportType,3) {Words}set ob(prompt,ttsText) {$user please select from one of the following area's ,Economics,People,Geographical,General,If you would like to select by mouse say, Mouse, or if you wish to quit the program say, Quit, now}set ob(repairStatus) {default}set ob(recogportType,4) {Words}set ob(recogportType,5) {Words}set ob(changerate) {5}set ob(prompt) {$user please select from one of the following area's ,Economics,People,Geographical,General,If you would like to select by mouse say, Mouse, or if you wish to quit the program say, Quit, now}set ob(dynamicWords) {{Mouse {m aU s}} {mouse_entry {m aU s [.pau] E n tc th 9r i:}} {by_mouse {bc b aI [.pau] m aU s}} {click {kc kh l I kc kh}} {*any .any} {Economy {I kc kh A n ^ m i:}} {Economics {E kc kh ^ n A m I kc kh s}} {Economy_Information {I kc kh A n ^ m i: [.pau] I n f 3r m ei S ^ n}} {People {pc ph i: pc ph ^ l}} {People_Information {pc ph i: pc ph ^ l [.pau] I n f 3r m ei S ^ n}} {Geographic {dZc dZ i: ^ gc g
9r @ f I kc kh}} {Geographical {dZc dZ i: ^ gc g 9r @ f I kc kh ^ l}} {Geography {dZc dZ i: A gc g 9r ^ f i:}} {Geo {dZc dZ i: oU}} {General {dZc dZ E n 3r ^ l}} {General_Information {dZc dZ E n 3r ^ l [.pau] I n f 3r m ei S ^ n}} {Info {I n f oU}} {Information {I n f 3r m ei S ^ n}} {Quit {kc kh w I tc th}} {Stop {s tc th A pc ph}} {Go_Away {gc g oU [.pau] ^ w ei}}}set ob(dyn:recog) {0}set ob(prompt,recordFlag) {0}set ob(bargein) {off}set ob(portType,0) {Undefined}set ob(portType,1) {Undefined}set ob(package) {Base}set ob(portType,2) {Undefined}set ob(portType,3) {Undefined}set ob(portType,4) {Undefined}set ob(portType,5) {Undefined}set ob(override:caption) {0}set ob(name) {InfoSelect}set ob(dtmf,interrupt) {0}set ob(words) {{{Mouse {mouse entry} {by mouse} click *any} {} {{{m aU s}} {{m aU s [.pau] E n tc th 9r i:}} {{bc b aI [.pau] m aU s}} {{kc kh l I kc kh}} .any}} {{Economy Economics {Economy Information}} {} {{{I kc kh A n ^ m i:}} {{E kc kh ^ n A m I kc kh s}} {{I kc kh A n ^ m i: [.pau] I n f 3r m ei S ^ n}}}} {{People {People Information}} {} {{{pc ph i: pc ph ^ l}} {{pc ph i: pc ph ^ l [.pau] I n f 3r m ei S ^ n}}}} {{Geographic Geographical Geography Geo} {} {{{dZc dZ i: ^ gc g 9r @ f I kc kh}} {{dZc dZ i: ^ gc g 9r @ f I kc kh ^ l}} {{dZc dZ i: A gc g 9r ^ f i:}} {{dZc dZ i: oU}}}} {{General {General Information} Info Information} {} {{{dZc dZ E n 3r ^ l}} {{dZc dZ E n 3r ^ l [.pau] I n f 3r m ei S ^ n}} {{I n f oU}} {{I n f 3r m ei S ^ n}}}} {{Quit Stop {Go Away}} {} {{{kc kh w I tc th}} {{s tc th A pc ph}} {{gc g oU [.pau] ^ w ei}}}}}set ob(grammar) {{{} {}} {{} {}} {{} {}} {{} {}} {{} {}} {{} {}}}set ob(recognizer) {name adult_english_16khz_0.ob infoDial American infoRate 16000}
# subnet_61set x0 [expr 20.0 + $offsetX] set y0 [expr 720.0 + $offsetY] set obvar [newO subnet $x0 $y0 {no 1}]set r(subnet_61) $obvarupvar #0 $obvar obset ob(recogType) {Tree}set ob(override:recognizer) {0}set ob(dtmf,mode) {off}set ob(prompt,type) {tts}set ob(override:sdet) {0}set ob(override:vumeter) {0}set ob(override:tts) {0}set ob(prompt,recordFlag) {0}set ob(bargein) {off}set ob(package) {Base}
set ob(override:caption) {0}set ob(name) {Feelings}set ob(dtmf,interrupt) {0}set ob(words) {{{} {} {}}}set ob(grammar) {{}}
# subnet_62set x0 [expr -120.0 + $offsetX] set y0 [expr 410.0 + $offsetY] set obvar [newO subnet $x0 $y0 {no 1}]set r(subnet_62) $obvarupvar #0 $obvar obset ob(recogType) {Tree}set ob(override:recognizer) {0}set ob(dtmf,mode) {off}set ob(prompt,type) {tts}set ob(override:sdet) {0}set ob(override:vumeter) {0}set ob(override:tts) {0}set ob(prompt,recordFlag) {0}set ob(bargein) {off}set ob(package) {Base}set ob(override:caption) {0}set ob(name) {Welcome}set ob(dtmf,interrupt) {0}set ob(words) {{{} {} {}}}set ob(grammar) {{}}
####### CONNECTIONSconnect r queryone_60 media_56 2 -114.0 588.0 -174.0 642.0 -234.0 705.0 $offsetX $offsetYconnect r media_58 subnet_53 0 -44.0 798.0 -44.0 797.0 -44.0 805.0 $offsetX $offsetYconnect r media_54 media_57 2 -234.0 648.0 -189.0 672.0 -144.0 705.0 $offsetX $offsetYconnect r media_54 media_55 0 -274.0 648.0 -299.0 672.0 -324.0 705.0 $offsetX $offsetYconnect r media_59 goodbye_49 0 36.0 898.0 -39.0 902.0 -114.0 915.0 $offsetX $offsetYconnect r media_54 media_58 3 -214.0 648.0 -129.0 672.0 -44.0 705.0 $offsetX $offsetYconnect r queryone_60 media_58 4 -74.0 588.0 -59.0 642.0 -44.0 705.0 $offsetX $offsetYconnect r input_48 subnet_62 0 -104.0 388.0 -104.0 387.0 -104.0 395.0 $offsetX $offsetYconnect r subnet_52 queryone_60 0 -194.0 898.0 -169.0 692.0 -144.0 495.0 $offsetX $offsetYconnect r media_56 subnet_52 0 -234.0 798.0 -234.0
$offsetX $offsetYconnect r queryone_60 media_57 3 -94.0 588.0 -119.0 642.0 -144.0 705.0 $offsetX $offsetYconnect r subnet_50 queryone_60 0 -284.0 898.0 -214.0 692.0 -144.0 495.0 $offsetX $offsetYconnect r media_54 subnet_61 4 -194.0 648.0 -79.0 672.0 36.0 705.0 $offsetX $offsetYconnect r media_55 subnet_50 0 -324.0 798.0 -324.0 797.0 -324.0 805.0 $offsetX $offsetY
connect r subnet_62 queryone_60 0 -104.0 488.0 -104.0 487.0 -104.0 495.0 $offsetX $offsetYconnect r subnet_61 media_59 0 36.0 798.0 36.0 797.0 36.0 805.0 $offsetX $offsetYconnect r media_57 subnet_51 0 -144.0 798.0 -144.0 797.0 -144.0 805.0 $offsetX $offsetYconnect r subnet_53 queryone_60 0 -84.0 898.0 -74.0 692.0 -64.0 495.0 $offsetX $offsetYconnect r queryone_60 subnet_61 5 -54.0 588.0 -9.0 642.0 36.0 705.0 $offsetX $offsetYconnect r queryone_60 media_55 1 -134.0 588.0 -229.0 642.0 -324.0 705.0 $offsetX $offsetYconnect r queryone_60 media_54 0 -154.0 588.0 -194.0 567.0 -234.0 555.0 $offsetX $offsetYconnect r subnet_51 queryone_60 0 -184.0 898.0 -184.0 495.0 -104.0 495.0 $offsetX $offsetYconnect r media_54 media_56 1 -254.0 648.0 -244.0 672.0 -234.0 705.0 $offsetX $offsetY
##### SUBDIALOGUEset offsetX 0set offsetY 0set id [registerScreen "Economic"]lappend newScreens subnet_50 $idrecordActiveScreen $id
# enter_86set x0 [expr -150 + $offsetX] set y0 [expr -60 + $offsetY] set obvar [newO enter $x0 $y0 {no 1}]set r(enter_86) $obvarupvar #0 $obvar obset ob(recogType) {Tree}set ob(override:recognizer) {0}set ob(dtmf,mode) {off}set ob(prompt,type) {tts}set ob(override:sdet) {0}set ob(override:vumeter) {0}set ob(override:tts) {0}set ob(prompt,recordFlag) {0}set ob(bargein) {off}set ob(package) {Base}set ob(override:caption) {0}set ob(name) {enter}set ob(dtmf,interrupt) {0}set ob(words) {{{} {} {}}}set ob(grammar) {{}}
set ob(override:repair) {0}set ob(override:tts) {0}set ob(prompt,ttsText) {This is a Demo of the full application, for the full version, please contact the makers at ,[email protected] ,the only countries available for selection are,the UK, Denmark, and Germany}
set ob(changerate) {5}set ob(prompt) {This is a Demo of the full application, for the full version, please contact the makers at ,[email protected] ,