DO COGNITIVE STYLES OF LEARNING AFFECT STUDENT PERFORMANCE IN
COMPUTER-ADAPTIVE TESTING?
MARIANA LILLEY
TREVOR BARKER
MARTA DE CAMPOS MAIA
SUMMARY
1. INTRODUCTION
2. COGNITIVE STYLE
3. COMPUTER-BASED AND COMPUTER-ADAPTIVE TESTING
4. ITEM RESPONSE THEORY
5. PROTOTYPE DEVELOPMENT AND EVALUATION
6. THE STUDY
7. RESULTS
8. DISCUSSION AND FUTURE WORK
9. REFERENCES
1. INTRODUCTION
This paper marks a further progression of research previously done by Fundação Getúlio Vargas on distance learning and by the University of Hertfordshire on the use of computer-adaptive tests in Higher Education (Lilley et al., 2002b). Previous work by the authors discussed how the number of distance-learning students in Brazil has been increasing significantly since 1997 (Meirelles and Maia, 2001), and how this growth has led to increased interest among both teaching staff and educational researchers in fully exploiting the digital path provided by the Internet (Lilley et al., 2002b).
This study focused on computer-delivered assessments, and its main purpose was to compare two types of computer-delivered assessment: computer-based tests and computer-adaptive tests. Whilst in a traditional computer-based test the questions presented during a given assessment session are not tailored to the specific ability of an individual student, in a computer-adaptive test the questions are selected dynamically for each student, based on how well the student has answered all previous questions administered during that session of assessment. The findings from this preliminary study suggested that the main benefits of a computer-adaptive test over a traditional computer-based test in the context of Production and Operations Management Education ranged from higher levels of interaction and motivation amongst students to the opportunity to bring academic assessment practices in line with those of leading testing programmes, such as the Graduate Management Admission Test (GMAT) and the Microsoft Certified Professional (MCP) examinations.
The popularisation of computer-delivered assessments within Higher Education is typically associated
with the potential benefits offered by this approach, such as widening of assessment methods, ability to
mark large numbers of assessments without additional workload to the lecturers and opportunity to
provide students with immediate feedback on their performance (Conole and Bull, 2002; Jolliffe et al.,
2001; Harvey and Mogey, 1999).
At present most computer-delivered assessments are based on text-only, serially delivered objective questions (e.g. multiple-choice questions). It is therefore important to ensure that such assessments are a fair measure of ability and that individual learners are not disadvantaged by this method, regardless of their cognitive style. Although Barker and Barker (2002) did not find any differences between Verbalisers and Imagers (Riding, 1991a) in more traditionally delivered assessments, it is possible that text-based computer-delivered assessments might disadvantage some students along the VI dimension.
The next section of this paper provides a brief introduction to cognitive styles of learning, as proposed
by Riding (1991a). In addition, we describe the development of a computer-adaptive test (CAT) and
compare it to traditional computer-based tests (CBTs). We also investigate whether or not cognitive
styles of learning have the potential to be an important factor influencing student performance when
participating in a computer-adaptive test. To this end, 133 students were tested using the Cognitive
Style Analysis (CSA) software. Their cognitive styles were then compared with their performance in a
computer-assisted assessment, which comprised both non-adaptive (i.e. CBT) and adaptive (i.e. CAT)
elements. In the final section of this paper, we discuss how the findings from this study can be applied
within the domain of Business Administration distance learning.
2. COGNITIVE STYLE
Riding (1991a) has described two bipolar dimensions of cognitive style, which are Wholist-Analytic
(WA) and Verbaliser-Imager (VI). Both dimensions are illustrated in Figure 1.
Figure 1: The two Cognitive Style dimensions
A simple computer-based application, namely Cognitive Style Analysis (CSA), has been developed by
Riding (1991b) in order to classify learners according to their position along the WA and VI
dimensions. The two styles or dimensions are independent of each other, so that the position of an
individual on one scale does not influence their position on the other.
Riding’s WA dimension describes whether individuals process information in wholes or in parts.
Wholists tend to see the whole of a picture and appreciate the whole of the context in one go.
Notwithstanding this ability, they have difficulty in appreciating the parts. Analytics are capable of
seeing the details of a situation, but may have difficulty in appreciating how the different parts relate to
each other. Figures 2 and 3 respectively illustrate how a scenario or information might be handled by
an Analytic and a Wholist individual (Riding, 1996).
Figure 2: Analytic View    Figure 3: Wholist View
Riding’s VI dimension classifies whether individuals represent information during thinking in words or
pictures. Verbalisers tend to organize thinking as associations of words whereas Imagers tend to think
and handle mental information in the form of pictures. Riding suggests that this is likely to influence
the mode of presentation they prefer and the type of tasks they find easy. Riding also relates the focus
of mental activity as internal for Imagers and external for Verbalisers (Riding, 1991a). Individuals
whose focus is external tend to be more socially aware and favour stimulating environments.
Conversely, individuals whose focus is internal have a tendency to be less socially aware, favouring
less dynamic environments.
Riding and others have provided extensive research relating the WA and VI cognitive styles to learning
processes. For example, Riding and Sadler-Smith (1992) were able to show that the effectiveness of a
learning package presented in text or image format could be related to learning style. There was a
significant interaction between the application presentation style and learning style when pre-test and
post-test scores were compared (Riding and Sadler-Smith, 1992). Riding and Watts (1997) studied the
effect of cognitive style on the preferred formats of learning materials. They found that Imagers were
more likely to select picture-based presentations than text based ones. Verbalisers were more likely to
choose text-based presentations (Riding and Watts, 1997). Further work from Douglas and Riding
(1993) suggests that performance in different presentation modes closely matches learning style. For
example, Imagers performed better than Verbalisers in text plus picture presentations. In contrast,
Verbalisers performed better than Imagers when text was presented without pictures.
In the study introduced here, each student’s cognitive style was determined along the WA and VI
dimensions prior to taking the course, by means of the computer-delivered CSA test developed by
Riding (1991b). This test is simple and requires users to distinguish between true and false statements
about various objects and to locate and distinguish between information presented graphically. Based
on the test, which takes approximately 10 minutes to complete, users were assigned a score along
Riding’s WA and VI dimensions.
An outline of two types of computer-delivered assessments, namely computer-based test (CBT) and
computer-adaptive test (CAT) is presented next.
3. COMPUTER-BASED AND COMPUTER-ADAPTIVE TESTING
Typical CBTs are based on the use of objective questions. These objective questions are devised in
such a way that their marking does not rely on any subjective judgment on the part of the marker. A
further characteristic of CBTs would be that all students are subjected to the same set of questions
during the test. In well-designed CBTs, the questions administered present different levels of
difficulty, ranging from easier questions requiring only the recall of information to more difficult
questions in which higher intellectual skills such as analysis are needed (Pritchett, 1999).
One of the limitations of the CBT approach is the fact that students who are more able may feel
unmotivated to answer questions that are too easy, as these do not present an appropriate level of
challenge to their current ability. In a similar way, questions that are too difficult may lead less able
students to answer these questions by guessing rather than reasoning.
In earlier work by Lilley et al. (2002a) it was identified that CBTs are typically constructed on the
premise that a broad range of abilities should be assessed. To this end, the test contains questions of
several difficulty levels to match all these different levels of ability. In so doing, more able students are
required to answer questions that are below their level of ability before they are presented with
questions that are challenging. In the same vein, less able students start answering questions that are
likely to be appropriate for their level of ability and then, at a later stage, are presented with questions
that can be deemed too difficult and hence frustrating and even intimidating. In both cases, questions
that are not appropriate for the level of ability of a particular student provide little or no valuable
information about this student.
Lilley and colleagues (2002b) suggest that an alternative to conventional CBTs is the use of CATs.
The underlying principle within CATs is that questions that offer an appropriate level of challenge for
an individual student provide more information about this student than questions that are either too
difficult or unchallenging. In the next section of this paper we describe how Item Response Theory is
applied within the CAT application introduced here.
4. ITEM RESPONSE THEORY
Computer-adaptive tests are based on Item Response Theory (IRT), which consists of a family of
mathematical models that can be used in the design and implementation of educational tests. The
model we selected to use in our application is the Three-Parameter Logistic Model (3-PL) and its
mathematical expression is shown in Equation 1 (Lord, 1980):
Equation 1: The Three-Parameter Logistic Model

$$P(\theta) = c + \frac{1 - c}{1 + e^{-1.7a(\theta - b)}}$$
In Equation 1, $\theta$ represents a student’s ability. As the model’s name implies, the 3-PL Model
comprises three parameters: (i) b, which represents the level of difficulty of the question; (ii) a, which
corresponds to the question’s discrimination and hence determines the usefulness of a question when
distinguishing among students near an ability level (Hambleton, 1991); and (iii) c, which denotes the
probability of a student answering the question correctly by chance. Finally, $P(\theta)$ corresponds to the
probability of a student with ability $\theta$ answering a question of difficulty b correctly.
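By way of illustration (this is not the prototype's code, and the parameter values in the example are hypothetical), Equation 1 can be evaluated as follows:

```python
import math

def three_pl_probability(theta, a, b, c):
    """Probability of a correct response under the 3-PL model (Equation 1).

    theta -- student ability estimate
    a     -- item discrimination
    b     -- item difficulty
    c     -- pseudo-guessing parameter (probability of a correct guess)
    """
    return c + (1.0 - c) / (1.0 + math.exp(-1.7 * a * (theta - b)))

# Hypothetical example: a student of average ability (theta = 0) answering an
# item of average difficulty (b = 0) with four response options (c = 0.25).
print(three_pl_probability(theta=0.0, a=1.0, b=0.0, c=0.25))  # 0.625
```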
In a CAT, the questions to be administered during a given session of assessment are based on the
student’s performance on previous questions. In general terms, when a correct response is given, the
adaptive algorithm statistically estimates the student’s ability as higher than previously estimated, and
then presents a more difficult question that closely matches that new higher estimate of the student's
ability. Conversely, if the response provided is incorrect, the adaptive algorithm estimates the student’s
ability as being lower and then presents an easier question that matches that new lower estimate.
The process of administering questions, evaluating responses and selecting the next question to be
presented is repeated until a stop condition has been met. In fixed-length CATs, the most common stop
conditions are duration (i.e. a certain time limit has been reached) or number of questions answered. In
variable-length CATs, the test usually stops when a certain standard error for the estimate of the
student’s ability has been met.
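As an illustrative sketch, the administer-evaluate-select cycle and the fixed-length stop conditions described above might be organised as below. This is not the prototype's implementation: the nearest-difficulty item selection and the step-based ability update are simplified stand-ins for the statistical estimation procedure, and all names and values are hypothetical.

```python
import random
import time

# Hypothetical item bank: each item carries its 3-PL parameters a, b and c.
ITEM_BANK = [{"a": 1.0, "b": step / 4.0, "c": 0.25} for step in range(-8, 9)]

def next_item(theta, available):
    """Pick the unanswered item whose difficulty is closest to the current
    ability estimate -- a simplified stand-in for information-based selection."""
    return min(available, key=lambda item: abs(item["b"] - theta))

def run_cat(max_items=20, time_limit_secs=1800):
    theta = 0.0                  # initial ability estimate
    available = list(ITEM_BANK)
    answered = 0
    start = time.time()
    # Fixed-length stop condition: question quota or time limit, whichever
    # occurs first (the item bank must also not be exhausted).
    while (available and answered < max_items
           and (time.time() - start) < time_limit_secs):
        item = next_item(theta, available)
        available.remove(item)
        correct = random.random() < 0.5   # placeholder for the real response
        # Crude step update standing in for the statistical re-estimation:
        # a correct answer raises the estimate, an incorrect one lowers it.
        theta += 0.25 if correct else -0.25
        answered += 1
    return theta  # the final estimate is taken as the student's ability

print(run_cat())
```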
The following section presents the main milestones regarding the development and evaluation of the
CAT prototype introduced here.
5. PROTOTYPE DEVELOPMENT AND EVALUATION
The adaptive algorithm used in the prototype we developed was based on the 3-PL Model from IRT.
This model was chosen over other models for dichotomously scored questions, such as the One and
Two Parameter Logistic models, as it takes into consideration the possibility of a student giving a
correct response by guessing (Lord, 1980).
In order to verify the feasibility of the CAT approach in a Higher Education context, we designed and
constructed a high-fidelity prototype for initial testing of our ideas. This prototype consisted of a
Graphical User Interface and a database of 250 objective questions in the domain of English as a
second language.
This prototype was subjected to a heuristic evaluation involving 11 experts in the subject domain and in
Computer Science. In this evaluation, the experts used a set of guidelines to evaluate the prototype.
The fact that no major usability problems were found was taken to indicate that the prototype was
usable and that the interface imposed no significant barriers to assessment. The panel of experts also
evaluated the prototype’s effectiveness as a computer-delivered assessment tool in an academic
context. The findings from this evaluation are described in full by Lilley and Barker (2002).
It was found that the speed and accuracy of marking, and the opportunity to provide students with
immediate feedback, typically associated with a CBT remained unchanged when experts were using
the CAT. The CAT approach was reported as being potentially capable of offering higher
levels of individualisation than offered by a traditional CBT. This was due to the fact that the
application was able to select the most appropriate questions to be presented to each individual student
based upon their ability, as dynamically measured during the test. In addition to expert evaluation, a
range of trials was conducted with users. These involved studies of user performance on tests, online
questionnaires and structured focus groups (Lilley and Barker, 2002; Lilley et al., 2002a). No major
usability or other problems were identified in these studies.
In the following section we summarise how the study was carried out.
6. THE STUDY
A group of 133 students enrolled in a second year Visual Basic programming course at the University
of Hertfordshire participated in assessments using the CAT application we developed. The application
used for the assessments was an enhanced version of the high-fidelity prototype described in the
previous section. In order to facilitate better formatting of questions, the prototype’s interface was
modified to support new graphical elements. In addition, a new database containing 119 questions
covering the module syllabus was created and calibrated by subject experts. A non-adaptive element
(i.e. CBT) was added to the application. In so doing, we expected not only to gather valuable
information for comparative purposes but also to guarantee that the test would be fair to all
participants. Figure 4 illustrates how the application was structured.
Figure 4: Flowchart illustrating how the application was structured
Participants took the tests as part of their regular assessment in a second year Higher National Diploma
course in Computer Science. Both sessions of assessment were conducted under supervised conditions
in computer laboratories. Immediately prior to the first assessment, all participants took Riding’s CSA
test under supervised conditions. A trained administrator gave each participant a short introduction to
the test as recommended by Riding (1991a).
Assessment 1 took place in week 7 and comprised 10 non-adaptive questions (i.e. CBT
mode) followed by 10 adaptive ones (i.e. CAT mode). Assessment 2 took place during week 10 and
comprised 10 CBT questions followed by 20 CAT ones. Prior to the first session of assessment using
the software, students were given a brief introduction to the use of the software, but were not informed
of the existence of two sections within the test (i.e. CBT followed by CAT). In both sessions of
assessment, the order in which the CBT questions were presented was randomly selected, as an attempt
to minimise unauthorised collaboration amongst students.
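Randomising the presentation order of a fixed question set is straightforward; a minimal sketch with hypothetical question identifiers:

```python
import random

# Hypothetical identifiers for the ten fixed CBT questions.
cbt_questions = [f"Q{i}" for i in range(1, 11)]

# Each student receives the same questions in an independently shuffled order.
presentation_order = random.sample(cbt_questions, k=len(cbt_questions))
print(presentation_order)
```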
The stop condition in the application described here is a combination of duration and number of
questions answered: the test ends either when a certain time limit has been reached or immediately
after a certain number of questions have been administered, whichever occurs first. When the stop
condition has been met, the test ends and the last estimated level of ability is taken to be the student’s
ability for that given session of assessment. The level of ability ranged from –2 (lowest) to +2
(highest), and from now on this value will be referred to as “Level CAT”.
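Expressed as code, this combined stop condition and the bounded final estimate amount to two short checks; the sketch below uses hypothetical names:

```python
def should_stop(num_answered, elapsed_secs, max_items, time_limit_secs):
    """The test ends when either limit is reached, whichever occurs first."""
    return num_answered >= max_items or elapsed_secs >= time_limit_secs

def level_cat(theta):
    """Final ability estimate, bounded to the [-2, +2] scale used here."""
    return max(-2.0, min(2.0, theta))
```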
The results obtained by the students in assessments 1 and 2 are presented and analysed in the following
section of this paper.
7. RESULTS
A summary of student performance in both assessments is presented in Table 1. In Tables 1 and 2, the
columns “%CBT” and “%CAT” represent the percentage of correct responses in that part of the
assessment.
Table 1: Mean scores obtained by students in assessments 1 and 2 (N=133)

                Assessment 1                          Assessment 2
%CBT1    %CAT1    Level CAT1            %CBT2    %CAT2    Level CAT2
 51.5     59.9      -0.83                42.3     53.0      -0.91
A Pearson’s Product Moment correlation was performed on the scores summarised in Table 1. The
results of this correlation are shown in Table 2.
The statistical analysis shown in Table 2 indicates significant positive correlations between the CAT
scores, the CAT levels obtained and the CBT scores (all p < 0.001). This was interpreted as
indicating that students were not disadvantaged by the use of a CAT.
Table 2: Pearson’s Product Moment correlation between the scores and levels for participants in CBT and CAT sections of
two assessments (N = 133). All correlations are significant at p < 0.001; an asterisk (*) marks a cell omitted from the
symmetric matrix.

Variable        Level CAT2    %CBT1    %CAT1    %CBT2    %CAT2
Level CAT1      0.633         0.832    0.498    0.537    0.566
Level CAT2      *             0.541    0.329    0.800    0.696
%CBT1           *             *        0.329    0.467    0.499
%CAT1           *             *        *        0.379    0.428
%CBT2           *             *        *        *        0.595
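For readers who wish to reproduce this kind of analysis, pairwise correlations of this form can be computed with standard statistical tools. The sketch below uses scipy; the score lists are toy illustrative values, not the study's data.

```python
from scipy.stats import pearsonr

# Toy illustrative scores only -- the study's raw data are not reproduced here.
pct_cbt1 = [55.0, 40.0, 70.0, 60.0, 45.0]
level_cat1 = [-0.5, -1.2, 0.3, -0.1, -0.9]

r, p = pearsonr(pct_cbt1, level_cat1)
print(f"Pearson's r = {r:.3f}, p = {p:.3f}")
```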
Table 3 shows the performance of students on assessments 1 and 2 according to their cognitive styles.
Students were assigned as Wholist, Intermediate or Analytic according to their score on Riding’s WA
dimension (Wholist < 0.99, Intermediate = 0.99-1.28, Analytic > 1.28) and as Verbaliser, Bimodal or
Imager based on Riding’s VI dimension (Verbaliser< 1.02, Bimodal = 1.02–1.14, Imager > 1.14).
When analysing the data provided in Table 3, it is essential to bear in mind that there is no correlation
between cognitive styles and intellectual skills (Armstrong, 2000).
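These cutoffs amount to a simple classification rule over the CSA ratio scores; a minimal sketch:

```python
def wa_style(ratio):
    """Classify a CSA Wholist-Analytic ratio using the cutoffs above."""
    if ratio < 0.99:
        return "Wholist"
    if ratio <= 1.28:
        return "Intermediate"
    return "Analytic"

def vi_style(ratio):
    """Classify a CSA Verbaliser-Imager ratio using the cutoffs above."""
    if ratio < 1.02:
        return "Verbaliser"
    if ratio <= 1.14:
        return "Bimodal"
    return "Imager"

print(wa_style(1.30), vi_style(0.95))  # Analytic Verbaliser
```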
Table 3: Mean scores obtained by students in assessments 1 and 2 according to Riding’s Cognitive Styles Analysis
(N=133)

                                Assessment 1                      Assessment 2
Cognitive Style        %CBT1   %CAT1   Level CAT1       %CBT2   %CAT2   Level CAT2
VI Dimension
  Verbaliser            52      61      -0.87             45      54      -0.83
  Bimodal               51      56      -0.91             41      50      -1.06
  Imager                52      62      -0.72             41      54      -0.84
WA Dimension
  Wholist               53      58      -0.79             41      52      -0.99
  Intermediate          52      63      -0.81             41      53      -0.87
  Analytic              49      59      -0.90             44      54      -0.87
A repeated measures ANOVA was performed on the data summarised in Table 3. No significant
differences were found in the overall performance of Verbalisers and Imagers, or of Wholists and
Analytics, and no significant effects of cognitive style on performance in either assessment were
found. This is an important finding, as it suggests that learners with different
cognitive styles are not disadvantaged by either CAT or indeed CBT.
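The full repeated measures design calls for a dedicated routine; as a simplified illustrative stand-in (not the analysis reported above), a one-way ANOVA comparing the style groups on a single score can be computed with scipy. The group scores below are toy values, not the study's data.

```python
from scipy.stats import f_oneway

# Toy illustrative %CAT scores per VI group -- not the study's data.
verbalisers = [61, 58, 64, 60]
bimodals = [56, 54, 59, 57]
imagers = [62, 60, 65, 61]

f_stat, p_value = f_oneway(verbalisers, bimodals, imagers)
print(f"F = {f_stat:.2f}, p = {p_value:.3f}")
```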
8. DISCUSSION AND FUTURE WORK
The work presented here is part of ongoing research into the value of CATs in Higher Education. In
this work, the development and evaluation of a CAT prototype (Lilley and Barker, 2002; Lilley et al.,
2002a) and the potential benefits of this approach in a Business Administration distance-learning
domain (Lilley et al., 2002b) were investigated.
Findings from this preliminary research indicated that there is a range of advantages to using CATs and
indeed CBTs, such as timely feedback to students, reducing academic staff marking workload and
speed and accuracy of marking. Such advantages are particularly significant in distance learning
programmes, as participating members of academic staff often indicate that the time taken in ongoing
monitoring, technical troubleshooting, administering and preparing website learning materials, and
responding to increased asynchronous communication requests is far greater than anticipated.
Notwithstanding the well-known advantages listed earlier, a previous study by Maia and Meirelles
(2002) on Brazilian distance learning programmes showed that, while only 33.4% of the state-owned
Universities assessed their students at a distance, none of the privately owned institutions did so.
The implementation of distance learning programmes is being viewed as a viable alternative to widen
participation in Higher Education in Brazil. This is due to the fact that distance-learning education
costs are typically lower than those of traditional education. Moreover, in distance-learning
education the students work at their own pace, which could facilitate the coexistence of work and
study, especially for those students needing to work during or immediately after completion of
secondary school (Meirelles and Maia, 2001). Given that most of the distance-learning programmes
require students to have access to the Internet, one of the challenges for these programmes is to offer a
set of web-based applications that supports the full exploitation of this digital path.
In terms of computer-delivered assessments at a distance, it is our belief that web-based CATs have the
potential to offer a practical and pedagogically valuable solution, and our view is shared by Wainer
(2000). This is due to the fact that the results from our research to date suggest that, despite some
limitations, the CAT application introduced here was able to exhibit all the advantages of a traditional
CBT and, in addition, provided more motivation for learners, more information for tutors and offered a
useful measure of student ability. The limitations of the CAT approach usually relate to the greater
effort required to implement an adaptive algorithm and to build a much larger, calibrated question
database than is necessary for a traditional CBT. We argue that this potential pitfall is outweighed
by the advantages listed earlier and by further benefits, such as increased security.
Given that all questions administered during a CAT session of assessment are interactively selected, the
probability that one student will be answering exactly the same set of questions as any other is
diminished. This characteristic creates a barrier to unauthorised collaboration amongst students.
In addition to the CAT qualities of speed, accuracy, interactivity and increased security, the findings
from this study suggest that learners with different cognitive styles are not disadvantaged by the use of
CATs. This is an important result, as a study performed in the domain of Business Administration by
Armstrong (2000) suggests that students whose cognitive style is Analytic tend to perform better than
their Wholist counterparts on written assignment formats in which systematic analysis and evaluation
of information are required. Armstrong (2000) also indicates that, although considerable progress has
been made in terms of matching cognitive styles with learning materials, no consistent effort has yet
been made to investigate the correlation between different cognitive styles and assessment
performance. Like Armstrong (2000), we believe this is an important area for further research, given
that all students should be given equal opportunity to achieve a high degree classification, regardless of
their cognitive style. Furthermore, if the assessment criteria used in a certain programme favour one
cognitive style over others, this fact may have implications for students as well as those potential
employers who use degree classification as their main selection criterion.
Finally, it is our conviction that the CAT approach has the potential to positively impact the way in
which assessments are undertaken in Higher Education distance learning programmes in Brazil. The
quality and speed of feedback provided to students can be improved by the CAT approach. Moreover,
the extent to which academic staff are aware of their students’ progress and weaknesses might be
enhanced. It is important to emphasise that to be fully effective, student assessment should be a regular
activity, and not an isolated assignment or exam at the end of the course. In order to maximise the
benefits for students, the assessment process should also include different delivery media and a wide
and well-balanced range of delivery methods, such as CATs, practical projects, posters, essays and so
on.
9. REFERENCES
Armstrong, S. J. The influence of individual cognitive style on performance in Management Education.
Educational Psychology, 20, 3, pp 323-340, 2000.
Barker, T. and Barker, J. The evaluation of complex, intelligent, interactive, individualised human-
computer interfaces: What do we mean by reliability and validity? Proceedings of the European
Learning Styles Information Network Conference, University of Ghent, 2002.
Conole, G. and Bull, J. Pebbles in the Pond: Evaluation of the CAA Centre. Proceedings of the 6th
Computer-Assisted Assessment Conference, Loughborough, pp 63-73, 2002.
Douglas, G. and Riding, R. J. The effect of cognitive styles and position of prose passage title on
recall. Educational Psychology, 13, pp 385-393, 1993.
Hambleton, R. K. Fundamentals of Item Response Theory. California: Sage Publications Inc, 1991.
Harvey, J. and Mogey, N. Pragmatic issues when integrating technology into the assessment of students
in Brown, S., Race, P. and Bull, J. Computer-Assisted Assessment in Higher Education. London:
Kogan Page, 1999.
Jolliffe, A., Ritter, J. and Stevens, D. The online learning handbook: developing and using web-based
learning. London: Kogan Page, 2001.
Lilley, M. and Barker, T. The Development and Evaluation of a computer-adaptive Testing
Application for English Language. Proceedings of the 6th Computer-Assisted Assessment Conference,
Loughborough, pp 169-184, 2002.
Lilley, M., Barker, T., Bennett, S. and Britton, C. How computers can adapt to user's knowledge: a
comparison between traditional computer-based and computer-adaptive tests. Proceedings of the
International Conference on Information and Communication Technologies in Education (ICTE2002),
Badajoz, Spain, pp 701-705, 2002a.
Lilley, M., Barker, T. and Maia, M. Web-based adaptive testing in distance learning: an overview.
Proceedings of the V Simpósio de Administração da Produção, Logística e Operações Internacionais,
São Paulo, 2002b.
Lord, F.M. Applications of Item Response Theory to practical testing problems. New Jersey: Lawrence
Erlbaum Associates, 1980.
Maia, M. de C. and Meirelles, F. S. Educação à Distância: Modelos Pedagógicos das Universidades.
Proceedings of the V Simpósio de Administração da Produção, Logística e Operações Internacionais,
São Paulo, 2002.
Meirelles, F. S. and Maia, M. de C. Educação a Distância: o caso da Open University. Proceedings of
the IV Simpósio de Administração da Produção, Logística e Operações Internacionais, Guarujá, 2001.
Pritchett, N. Effective Question Design in Brown, S., Race, P. and Bull, J. Computer-Assisted
Assessment in Higher Education. London: Kogan Page, 1999.
Riding, R. J. Cognitive Style Analysis. Birmingham: Learning and Training Technology, 1991a.
Riding, R. J. Learning Style and Technology-Based Training. Sheffield: Department for Education
and Employment, 1996.
Riding, R. J. and Sadler-Smith, E. Type of Instructional Material, Cognitive Style and Learning
Performance. Educational Studies, 18, 3, pp 323-340, 1992.
Riding, R. J. and Watts, M. The effect of cognitive style on the preferred format of instructional
material. Learning Styles and Strategies (Riding, R. J. and Rayner, S. G. eds.). Educational
Psychology, 17, 1/2, pp 179-184, 1997.
Riding, R. J. Cognitive Style Analysis, User’s Manual. Birmingham: Learning and Training Technology,
1991b.
Riding, R. J. On the nature of cognitive style. Discussion paper for the Learning Styles Workshop, April
1996, Assessment Research Unit, University of Birmingham, 1997.
Wainer, H. CATs: Whither and whence. Psicológica: revista de metodología y psicología
experimental, 21(1), 121-133, 2000.
KEY WORDS: Computer-adaptive Test, Item Response Theory, Distance Learning, Computer-
delivered Assessment, Cognitive Style, Business Administration