DO COGNITIVE STYLES OF LEARNING AFFECT STUDENT PERFORMANCE IN
COMPUTER-ADAPTIVE TESTING?
MARIANA LILLEY
TREVOR BARKER
MARTA DE CAMPOS MAIA
SUMMARY
1. INTRODUCTION
2. COGNITIVE STYLE
3. COMPUTER-BASED AND COMPUTER-ADAPTIVE TESTING
4. ITEM RESPONSE THEORY
5. PROTOTYPE DEVELOPMENT AND EVALUATION
6. THE STUDY
7. RESULTS
8. DISCUSSION AND FUTURE WORK
9. REFERENCES
1. INTRODUCTION
This paper marks a further progression of research previously done by Fundação Getúlio Vargas on distance learning and by the University of Hertfordshire on the use of computer-adaptive tests in Higher Education (Lilley et al., 2002b). Previous work by the authors discussed how the number of distance-learning students in Brazil has been increasing significantly since 1997 (Meirelles and Maia, 2001), and how this growth has led to increased interest among both teaching staff and educational researchers in fully exploiting the digital path provided by the Internet (Lilley et al., 2002b).
This study focused on computer-delivered assessments, and its main purpose was to compare two types of computer-delivered assessment: computer-based tests and computer-adaptive tests. Whilst in a traditional computer-based test the questions presented during a given assessment session are not tailored to the specific ability of an individual student, in a computer-adaptive test the questions are selected dynamically for each student, based on how well the student has answered all previous questions administered during that session of assessment. The findings from this preliminary study suggested that the main benefits of a computer-adaptive test over a traditional computer-based test in the context of Production and Operations Management Education ranged from higher levels of interaction and motivation amongst students to the opportunity to bring academic assessment practices in line with those of leading testing programmes, such as the Graduate Management Admission Test (GMAT) and the Microsoft Certified Professional (MCP) examinations.
The popularisation of computer-delivered assessments within Higher Education is typically associated
with the potential benefits offered by this approach, such as widening of assessment methods, ability to
mark large numbers of assessments without additional workload to the lecturers and opportunity to
provide students with immediate feedback on their performance (Conole and Bull, 2002; Jolliffe et al.,
2001; Harvey and Mogey, 1999).
At present most computer-delivered assessments are based on text-only, serially delivered objective questions (e.g. multiple-choice questions). It is therefore important to ensure that such assessments are a fair measure of ability and that individual learners are not disadvantaged by this method, regardless of their cognitive style. Although Barker and Barker (2002) did not find any differences between Verbalisers and Imagers (Riding, 1991a) in more traditionally delivered assessments, it is possible that text-based computer-delivered assessments might disadvantage some students along the VI dimension.
The next section of this paper provides a brief introduction to cognitive styles of learning, as proposed
by Riding (1991a). In addition, we describe the development of a computer-adaptive test (CAT) and
compare it to traditional computer-based tests (CBTs). We also investigate whether or not cognitive
styles of learning have the potential to be an important factor influencing student performance when
participating in a computer-adaptive test. To this end, 133 students were tested using the Cognitive
Style Analysis (CSA) software. Their cognitive styles were then compared with their performance in a
computer-assisted assessment, which comprised both non-adaptive (i.e. CBT) and adaptive (i.e. CAT)
elements. In the final section of this paper, we discuss how the findings from this study can be applied
within the domain of Business Administration distance learning.
2. COGNITIVE STYLE
Riding (1991a) has described two bipolar dimensions of cognitive style, which are Wholist-Analytic
(WA) and Verbaliser-Imager (VI). Both dimensions are illustrated in Figure 1.
Figure 1: The two Cognitive Style dimensions
A simple computer-based application, namely Cognitive Style Analysis (CSA), has been developed by
Riding (1991b) in order to classify learners according to their position along the WA and VI
dimensions. The two styles or dimensions are independent of each other, so that the position of an
individual on one scale does not influence their position on the other.
Riding’s WA dimension describes whether individuals process information in wholes or in parts.
Wholists tend to see the whole of a picture and appreciate the whole of the context in one go.
Notwithstanding this ability, they have difficulty in appreciating the parts. Analytics are capable of
seeing the details of a situation, but may have difficulty in appreciating how the different parts relate to
each other. Figures 2 and 3 respectively illustrate how a scenario or information might be handled by
an Analytic and a Wholist individual (Riding, 1996).
Figure 2: Analytic View    Figure 3: Wholist View
Riding’s VI dimension classifies whether individuals represent information during thinking in words or
pictures. Verbalisers tend to organize thinking as associations of words whereas Imagers tend to think
and handle mental information in the form of pictures. Riding suggests that this is likely to influence
the mode of presentation they prefer and the type of tasks they find easy. Riding also relates the focus
of mental activity as internal for Imagers and external for Verbalisers (Riding, 1991a). Individuals
whose focus is external tend to be more socially aware and favour stimulating environments.
Conversely, individuals whose focus is internal have a tendency to be less socially aware, favouring
less dynamic environments.
Riding and others have provided extensive research relating the WA and VI cognitive styles to learning
processes. For example, Riding and Sadler-Smith (1992) were able to show that the effectiveness of a
learning package presented in text or image format could be related to learning style. There was a
significant interaction between the application presentation style and learning style when pre-test and
post-test scores were compared (Riding and Sadler-Smith, 1992). Riding and Watts (1997) studied the
effect of cognitive style on the preferred formats of learning materials. They found that Imagers were
more likely to select picture-based presentations than text based ones. Verbalisers were more likely to
choose text-based presentations (Riding and Watts, 1997). Further work from Douglas and Riding
(1993) suggests that performance in different presentation modes closely matches learning style. For
example, Imagers performed better than Verbalisers in text plus picture presentations. In contrast,
Verbalisers performed better than Imagers when text was presented without pictures.
In the study introduced here, each student’s cognitive style was determined along the WA and VI
dimensions prior to taking the course, by means of the computer-delivered CSA test developed by
Riding (1991b). This test is simple and requires users to distinguish between true and false statements
about various objects and to locate and distinguish between information presented graphically. Based
on the test, which takes approximately 10 minutes to complete, users were assigned a score along
Riding’s WA and VI dimensions.
An outline of two types of computer-delivered assessments, namely computer-based test (CBT) and
computer-adaptive test (CAT) is presented next.
3. COMPUTER-BASED AND COMPUTER-ADAPTIVE TESTING
Typical CBTs are based on the use of objective questions. These objective questions are devised in
such a way that their marking does not rely on any subjective judgment on the part of the marker. A
further characteristic of CBTs would be that all students are subjected to the same set of questions
during the test. In well-designed CBTs, the questions administered present different levels of
difficulty, ranging from easier questions requiring only the recall of information to more difficult
questions in which higher intellectual skills such as analysis are needed (Pritchett, 1999).
One of the limitations of the CBT approach is the fact that students who are more able may feel
unmotivated to answer questions that are too easy, as these do not present an appropriate level of
challenge to their current ability. In a similar way, questions that are too difficult may lead less able
students to answer these questions by guessing rather than reasoning.
In earlier work by Lilley et al. (2002a) it was identified that CBTs are typically constructed on the
premise that a broad range of abilities should be assessed. To this end, the test contains questions of
several difficulty levels to match all these different levels of ability. In so doing, more able students are
required to answer questions that are below their level of ability before they are presented with
questions that are challenging. In the same vein, less able students start answering questions that are
likely to be appropriate for their level of ability and then, at a later stage, are presented with questions
that can be deemed too difficult and hence frustrating and even intimidating. In both cases, questions
that are not appropriate for the level of ability of a particular student provide little or no valuable
information about this student.
Lilley and colleagues (2002b) suggest that an alternative to conventional CBTs is the use of CATs.
The underlying principle within CATs is that questions that offer an appropriate level of challenge for
an individual student provide more information about this student than questions that are either too
difficult or unchallenging. In the next section of this paper we describe how Item Response Theory is
applied within the CAT application introduced here.
4. ITEM RESPONSE THEORY
Computer-adaptive tests are based on Item Response Theory (IRT), which consists of a family of
mathematical models that can be used in the design and implementation of educational tests. The
model we selected to use in our application is the Three-Parameter Logistic Model (3-PL) and its
mathematical expression is shown in Equation 1 (Lord, 1980):
Equation 1: The Three-Parameter Logistic Model

$$P(\theta) = c + \frac{1 - c}{1 + e^{-1.7a(\theta - b)}}$$
In Equation 1, $\theta$ represents a student’s ability. As the model’s name implies, the 3-PL Model
comprises three parameters: (i) b, which represents the level of difficulty of the question; (ii) a, which
corresponds to the question’s discrimination and hence determines the usefulness of a question when
distinguishing among students near an ability level (Hambleton, 1991); and (iii) c, which denotes the
probability of a student answering the question correctly by chance. Finally, $P(\theta)$ corresponds to the
probability of a student with ability $\theta$ answering a question of difficulty b correctly.
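By way of illustration (this is not the prototype's code, and the parameter values in the example are hypothetical), Equation 1 can be evaluated as follows:

```python
import math

def three_pl_probability(theta, a, b, c):
    """Probability of a correct response under the 3-PL model (Equation 1).

    theta -- student ability estimate
    a     -- item discrimination
    b     -- item difficulty
    c     -- pseudo-guessing parameter (probability of a correct guess)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))

# Hypothetical example: a student of average ability (theta = 0) answering an
# item of average difficulty (b = 0) with four response options (c = 0.25).
print(three_pl_probability(theta=0.0, a=1.0, b=0.0, c=0.25))  # 0.625
```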
In a CAT, the questions to be administered during a given session of assessment are based on the
student’s performance on previous questions. In general terms, when a correct response is given, the
adaptive algorithm statistically estimates the student’s ability as higher than previously estimated, and
then presents a more difficult question that closely matches that new higher estimate of the student's
ability. Conversely, if the response provided is incorrect, the adaptive algorithm estimates the student’s
ability as being lower and then presents an easier question that matches that new lower estimate.
The process of administering questions, evaluating responses and selecting the next question to be
presented is repeated until a stop condition has been met. In fixed-length CATs, the most common stop
conditions are duration (i.e. a certain time limit has been reached) or number of questions answered. In
variable-length CATs, the test usually stops when a certain standard error for the estimate of the
student’s ability has been met.
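As an illustrative sketch, the administer-evaluate-select cycle and the fixed-length stop conditions described above might be organised as below. This is not the prototype's implementation: the nearest-difficulty item selection and the step-based ability update are simplified stand-ins for the statistical estimation procedure, and all names and values are hypothetical.

```python
import random
import time

# Hypothetical item bank: each item carries its 3-PL parameters a, b and c.
ITEM_BANK = [{"a": 1.0, "b": step / 4.0, "c": 0.25} for step in range(-8, 9)]

def next_item(theta, available):
    """Pick the unanswered item whose difficulty is closest to the current
    ability estimate -- a simplified stand-in for information-based selection."""
    return min(available, key=lambda item: abs(item["b"] - theta))

def run_cat(max_items=20, time_limit_secs=1800):
    theta = 0.0                  # initial ability estimate
    available = list(ITEM_BANK)
    answered = 0
    start = time.time()
    # Fixed-length stop condition: question quota or time limit, whichever
    # occurs first (the item bank must also not be exhausted).
    while (available and answered < max_items
           and (time.time() - start) < time_limit_secs):
        item = next_item(theta, available)
        available.remove(item)
        correct = random.random() < 0.5   # placeholder for the real response
        # Crude step update standing in for the statistical re-estimation:
        # a correct answer raises the estimate, an incorrect one lowers it.
        theta += 0.25 if correct else -0.25
        answered += 1
    return theta  # the final estimate is taken as the student's ability

print(run_cat())
```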
The following section presents the main milestones regarding the development and evaluation of the
CAT prototype introduced here.
5. PROTOTYPE DEVELOPMENT AND EVALUATION
The adaptive algorithm used in the prototype we developed was based on the 3-PL Model from IRT.
This model was chosen over other models for dichotomously scored questions, such as the One and
Two Parameter Logistic models, as it takes into consideration the possibility of a student giving a
correct response by guessing (Lord, 1980).
In order to verify the feasibility of the CAT approach in a Higher Education context, we designed and
constructed a high-fidelity prototype for initial testing of our ideas. This prototype consisted of a
Graphical User Interface and a database of 250 objective questions in the domain of English as a
second language.
This prototype was subjected to a heuristic evaluation involving 11 experts in the subject domain and in
Computer Science. In this evaluation, the experts used a set of guidelines to evaluate the prototype.
The fact that no major usability problems were found was taken to indicate that the prototype was
usable and that the interface imposed no significant barriers to assessment. The panel of experts also
evaluated the prototype’s effectiveness as a computer-delivered assessment tool in an academic
context. The findings from this evaluation are described in full by Lilley and Barker (2002).
It was found that the speed and accuracy of marking, and the opportunity to provide students with
immediate feedback, typically associated with a CBT remained unchanged when experts were using
the CAT. The CAT approach was reported as being potentially capable of offering higher
levels of individualisation than offered by a traditional CBT. This was due to the fact that the
application was able to select the most appropriate questions to be presented to each individual student
based upon their ability, as dynamically measured during the test. In addition to expert evaluation, a
range of trials was conducted with users. These involved studies of user performance on tests, online
questionnaires and structured focus groups (Lilley and Barker, 2002; Lilley et al., 2002a). No major
usability or other problems were identified in these studies.
In the following section we summarise how the study was carried out.
6. THE STUDY
A group of 133 students enrolled in a second year Visual Basic programming course at the University
of Hertfordshire participated in assessments using the CAT application we developed. The application
used for the assessments was an enhanced version of the high-fidelity prototype described in the
previous section. In order to facilitate better formatting of questions, the prototype’s interface was
modified to support new graphical elements. In addition, a new database containing 119 questions
covering the module syllabus was created and calibrated by subject experts. A non-adaptive element
(i.e. CBT) was added to the application. In so doing, we expected not only to gather valuable
information for comparative purposes but also to guarantee that the test would be fair to all
participants. Figure 4 illustrates how the application was structured.
Figure 4: Flowchart illustrating how the application was structured
Participants took the tests as part of their regular assessment in a second year Higher National Diploma
course in Computer Science. Both sessions of assessment were conducted under supervised conditions
in computer laboratories. Immediately prior to the first assessment, all participants took Riding’s CSA
test under supervised conditions. A trained administrator gave each participant a short introduction to
the test as recommended by Riding (1991a).
Assessment 1 took place in week 7 and comprised 10 non-adaptive questions (i.e. CBT
mode) followed by 10 adaptive ones (i.e. CAT mode). Assessment 2 took place during week 10 and
comprised 10 CBT questions followed by 20 CAT ones. Prior to the first session of assessment using
the software, students were given a brief introduction to the use of the software, but were not informed
of the existence of two sections within the test (i.e. CBT followed by CAT). In both sessions of
assessment, the order in which the CBT questions were presented was randomly selected, as an attempt
to minimise unauthorised collaboration amongst students.
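Randomising the presentation order of a fixed question set is straightforward; a minimal sketch with hypothetical question identifiers:

```python
import random

# Hypothetical identifiers for the ten fixed CBT questions.
cbt_questions = [f"Q{i}" for i in range(1, 11)]

# Each student receives the same questions in an independently shuffled order.
presentation_order = random.sample(cbt_questions, k=len(cbt_questions))
print(presentation_order)
```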
The stop condition in the application described here is a combination of duration and number of
questions answered: the test ends either when a certain time limit has been reached or immediately
after a certain number of questions have been administered, whichever occurs first. When the stop
condition has been met, the test ends and the last estimated level of ability is taken to be the student’s
ability for that given session of assessment. The level of ability ranged from –2 (lowest) to +2
(highest), and from now on this value will be referred to as “Level CAT”.
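Expressed as code, this combined stop condition and the bounded final estimate amount to two short checks; the sketch below uses hypothetical names:

```python
def should_stop(num_answered, elapsed_secs, max_items, time_limit_secs):
    """The test ends when either limit is reached, whichever occurs first."""
    return num_answered >= max_items or elapsed_secs >= time_limit_secs

def level_cat(theta):
    """Final ability estimate, bounded to the [-2, +2] scale used here."""
    return max(-2.0, min(2.0, theta))
```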
The results obtained by the students in assessments 1 and 2 are presented and analysed in the following
section of this paper.
7. RESULTS
A summary of student performance in both assessments is presented in Table 1. In Tables 1 and 2, the
columns “%CBT” and “%CAT” represent the percentage of correct responses in that part of the
assessment.
Table 1: Mean scores obtained by students in assessments 1 and 2 (N=133)

                Assessment 1                          Assessment 2
%CBT1    %CAT1    Level CAT1            %CBT2    %CAT2    Level CAT2
 51.5     59.9      -0.83                42.3     53.0      -0.91
A Pearson’s Product Moment correlation was performed on the scores summarised in Table 1. The
results of this correlation are shown in Table 2.
The statistical analysis shown in Table 2 indicates significant positive correlations between the CAT
scores, the CAT levels obtained and the CBT scores (all p < 0.001). This was interpreted as
indicating that students were not disadvantaged by the use of a CAT.
Table 2: Pearson’s Product Moment correlation between the scores and levels for participants in CBT and CAT sections of
two assessments (N = 133). All correlations are significant at p < 0.001; an asterisk (*) marks a cell omitted from the
symmetric matrix.

Variable        Level CAT2    %CBT1    %CAT1    %CBT2    %CAT2
Level CAT1      0.633         0.832    0.498    0.537    0.566
Level CAT2      *             0.541    0.329    0.800    0.696
%CBT1           *             *        0.329    0.467    0.499
%CAT1           *             *        *        0.379    0.428
%CBT2           *             *        *        *        0.595
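For readers who wish to reproduce this kind of analysis, pairwise correlations of this form can be computed with standard statistical tools. The sketch below uses scipy; the score lists are toy illustrative values, not the study's data.

```python
from scipy.stats import pearsonr

# Toy illustrative scores only -- the study's raw data are not reproduced here.
pct_cbt1 = [55.0, 40.0, 70.0, 60.0, 45.0]
level_cat1 = [-0.5, -1.2, 0.3, -0.1, -0.9]

r, p = pearsonr(pct_cbt1, level_cat1)
print(f"Pearson's r = {r:.3f}, p = {p:.3f}")
```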
Table 3 shows the performance of students on assessments 1 and 2 according to their cognitive styles.
Students were assigned as Wholist, Intermediate or Analytic according to their score on Riding’s WA
dimension (Wholist < 0.99, Intermediate = 0.99-1.28, Analytic > 1.28) and as Verbaliser, Bimodal or
Imager based on Riding’s VI dimension (Verbaliser< 1.02, Bimodal = 1.02–1.14, Imager > 1.14).
When analysing the data provided in Table 3, it is essential to bear in mind that there is no correlation
between cognitive styles and intellectual skills (Armstrong, 2000).
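These cutoffs amount to a simple classification rule over the CSA ratio scores; a minimal sketch:

```python
def wa_style(ratio):
    """Classify a CSA Wholist-Analytic ratio using the cutoffs above."""
    if ratio < 0.99:
        return "Wholist"
    if ratio <= 1.28:
        return "Intermediate"
    return "Analytic"

def vi_style(ratio):
    """Classify a CSA Verbaliser-Imager ratio using the cutoffs above."""
    if ratio < 1.02:
        return "Verbaliser"
    if ratio <= 1.14:
        return "Bimodal"
    return "Imager"

print(wa_style(1.30), vi_style(0.95))  # Analytic Verbaliser
```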
Table 3: Mean scores obtained by students in assessments 1 and 2 according to Riding’s Cognitive Styles Analysis
(N=133)

                                Assessment 1                      Assessment 2
Cognitive Style        %CBT1   %CAT1   Level CAT1       %CBT2   %CAT2   Level CAT2
VI Dimension
  Verbaliser            52      61      -0.87             45      54      -0.83
  Bimodal               51      56      -0.91             41      50      -1.06
  Imager                52      62      -0.72             41      54      -0.84
WA Dimension
  Wholist               53      58      -0.79             41      52      -0.99
  Intermediate          52      63      -0.81             41      53      -0.87
  Analytic              49      59      -0.90             44      54      -0.87
A repeated measures ANOVA was performed on the data summarised in Table 3. No significant
differences were found in the overall performance of Verbalisers and Imagers, or of Wholists and
Analytics, and no significant effects of cognitive style on performance in either assessment were
found. This is an important finding, as it suggests that learners with different
cognitive styles are not disadvantaged by either CAT or indeed CBT.
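The full repeated measures design calls for a dedicated routine; as a simplified illustrative stand-in (not the analysis reported above), a one-way ANOVA comparing the style groups on a single score can be computed with scipy. The group scores below are toy values, not the study's data.

```python
from scipy.stats import f_oneway

# Toy illustrative %CAT scores per VI group -- not the study's data.
verbalisers = [61, 58, 64, 60]
bimodals = [56, 54, 59, 57]
imagers = [62, 60, 65, 61]

f_stat, p_value = f_oneway(verbalisers, bimodals, imagers)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
```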
8. DISCUSSION AND FUTURE WORK
The work presented here is part of ongoing research into the value of CATs in Higher Education. In
this work, the development and evaluation of a CAT prototype (Lilley and Barker, 2002; Lilley et al.,
2002a) and the potential benefits of this approach in a Business Administration distance-learning
domain (Lilley et al., 2002b) were investigated.
Findings from this preliminary research indicated that there is a range of advantages to using CATs and
indeed CBTs, such as timely feedback to students, reducing academic staff marking workload and
speed and accuracy of marking. Such advantages are particularly significant in distance learning
programmes, as participating members of academic staff often indicate that the time taken in ongoing
monitoring, technical troubleshooting, administering and preparing website learning materials, and
responding to increased asynchronous communication requests is far greater than anticipated.
Notwithstanding the well-known advantages listed earlier, a previous study by Maia and Meirelles
(2002) on Brazilian distance learning programmes showed that, while only 33.4% of the state-owned
Universities assessed their students at a distance, none of the privately owned institutions did so.
The implementation of distance learning programmes is being viewed as a viable alternative to widen
participation in Higher Education in Brazil. This is due to the fact that distance-learning education
costs are typically lower than those of traditional education. Moreover, in distance-learning
education the students work at their own pace, which could facilitate the coexistence of work and
study, especially for those students needing to work during or immediately after completion of
secondary school (Meirelles and Maia, 2001). Given that most of the distance-learning programmes
require students to have access to the Internet, one of the challenges for these programmes is to offer a
set of web-based applications that supports the full exploitation of this digital path.
In terms of computer-delivered assessments at a distance, it is our belief that web-based CATs have the
potential to offer a practical and pedagogically valuable solution, and our view is shared by Wainer
(2000). This is due to the fact that the results from our research to date suggest that, despite some
limitations, the CAT application introduced here was able to exhibit all the advantages of a traditional
CBT and, in addition, provided more motivation for learners, more information for tutors and offered a
useful measure of student ability. The limitations of the CAT approach usually relate to the greater
effort required to implement an adaptive algorithm and to build a much larger, calibrated question
database than is necessary for a traditional CBT. We argue that this potential pitfall is outweighed
by the advantages listed earlier and by further benefits, such as increased security.
Given that all questions administered during a CAT session of assessment are interactively selected, the
probability that one student will be answering exactly the same set of questions as any other is
diminished. This characteristic creates a barrier to unauthorised collaboration amongst students.
In addition to the CAT qualities of speed, accuracy, interactivity and increased security, the findings
from this study suggest that learners with different cognitive styles are not disadvantaged by the use of
CATs. This is an important result, as a study performed in the domain of Business Administration by
Armstrong (2000) suggests that students whose cognitive style is Analytic tend to perform better than
their Wholist counterparts on written assignment formats in which systematic analysis and evaluation
of information are required. Armstrong (2000) also indicates that, although considerable progress has
been made in terms of matching cognitive styles with learning materials, no consistent effort has yet
been made to investigate the correlation between different cognitive styles and assessment
performance. Like Armstrong (2000), we believe this is an important area for further research, given
that all students should be given equal opportunity to achieve a high degree classification, regardless of
their cognitive style. Furthermore, if the assessment criteria used in a certain programme favour one
cognitive style over others, this fact may have implications for students as well as those potential
employers who use degree classification as their main selection criterion.
Finally, it is our conviction that the CAT approach has the potential to positively impact the way in
which assessments are undertaken in Higher Education distance learning programmes in Brazil. The
quality and speed of feedback provided to students can be improved by the CAT approach. Moreover,
the extent to which academic staff are aware of their students’ progress and weaknesses might be
enhanced. It is important to emphasise that to be fully effective, student assessment should be a regular
activity, and not an isolated assignment or exam at the end of the course. In order to maximise the
benefits for students, the assessment process should also include different delivery media and a wide
and well-balanced range of delivery methods, such as CATs, practical projects, posters, essays and so
on.
9. REFERENCES
Armstrong, S. J. The influence of individual cognitive style on performance in Management Education.
Educational Psychology, 20, 3, pp 323-340, 2000.
Barker, T. and Barker, J. The evaluation of complex, intelligent, interactive, individualised human-
computer interfaces: What do we mean by reliability and validity? Proceedings of the European
Learning Styles Information Network Conference, University of Ghent, 2002.
Conole, G. and Bull, J. Pebbles in the Pond: Evaluation of the CAA Centre. Proceedings of the 6th
Computer-Assisted Assessment Conference, Loughborough, pp 63-73, 2002.
Douglas, G. and Riding, R. J. The effect of cognitive styles and position of prose passage title on
recall. Educational Psychology, 13, pp 385-393, 1993.
Hambleton, R. K. Fundamentals of Item Response Theory. California: Sage Publications Inc, 1991.
Harvey, J. and Mogey, N. Pragmatic issues when integrating technology into the assessment of students
in Brown, S., Race, P. and Bull, J. Computer-Assisted Assessment in Higher Education. London:
Kogan Page, 1999.
Jolliffe, A., Ritter, J. and Stevens, D. The online learning handbook: developing and using web-based
learning. London: Kogan Page, 2001.
Lilley, M. and Barker, T. The Development and Evaluation of a computer-adaptive Testing
Application for English Language. Proceedings of the 6th Computer-Assisted Assessment Conference,
Loughborough, pp 169-184, 2002.
Lilley, M., Barker, T., Bennett, S. and Britton, C. How computers can adapt to user's knowledge: a
comparison between traditional computer-based and computer-adaptive tests. Proceedings of the
International Conference on Information and Communication Technologies in Education (ICTE2002),
Badajoz, Spain, pp 701-705, 2002a.
Lilley, M., Barker, T. and Maia, M. Web-based adaptive testing in distance learning: an overview.
Proceedings of the V Simpósio de Administração da Produção, Logística e Operações Internacionais,
São Paulo, 2002b.
Lord, F.M. Applications of Item Response Theory to practical testing problems. New Jersey: Lawrence
Erlbaum Associates, 1980.
Maia, M. de C. and Meirelles, F. S. Educação à Distância: Modelos Pedagógicos das Universidades.
Proceedings of the V Simpósio de Administração da Produção, Logística e Operações Internacionais,
São Paulo, 2002.
Meirelles, F. S. and Maia, M. de C. Educação a Distância: o caso da Open University. Proceedings of
the IV Simpósio de Administração da Produção, Logística e Operações Internacionais, Guarujá, 2001.
Pritchett, N. Effective Question Design in Brown, S., Race, P. and Bull, J. Computer-Assisted
Assessment in Higher Education. London: Kogan Page, 1999.
Riding, R. J. Cognitive Style Analysis. Birmingham: Learning and Training Technology, 1991a.
Riding, R. J. Learning Style and Technology-Based Training. Sheffield: Department for Education
and Employment, 1996.
Riding, R. J. and Sadler-Smith, E. Type of Instructional Material, Cognitive Style and Learning
Performance. Educational Studies, 18, 3, pp 323-340, 1992.
Riding, R. J. and Watts, M. The effect of cognitive style on the preferred format of instructional
material. Learning Styles and Strategies (Riding, R. J. and Rayner, S. G. eds.). Educational
Psychology, 17, 1/2, pp 179-184, 1997.
Riding, R. J. Cognitive Style Analysis, User’s Manual. Birmingham: Learning and Training Technology,
1991b.
Riding, R. J. On the nature of cognitive style. Discussion paper for the Learning Styles Workshop, April
1996, Assessment Research Unit, University of Birmingham, 1997.
Wainer, H. CATs: Whither and whence. Psicológica: revista de metodología y psicología
experimental, 21(1), 121-133, 2000.
KEY WORDS: Computer-adaptive Test, Item Response Theory, Distance Learning, Computer-
delivered Assessment, Cognitive Style, Business Administration