An interactive video test for pharmaceutical chemist's assistants

Computers in Human Behavior, Vol. 10, pp. 51-62, 1994. Printed in the U.S.A. All rights reserved.

0747-5632/94 $6.00 + .00 Copyright © 1993 Pergamon Press Ltd.

An Interactive Video Test for Pharmaceutical Chemists' Assistants

    Fred Bosman, Jacqueline Hoogenboom, and Geke Walpot

CITO, National Institute of Educational Measurement

Abstract - CITO (Dutch acronym for National Institute for Educational Measurement) has developed a test based on the interactive possibilities of a videodisc. The test is used within the departments of pharmaceutical education in the senior secondary vocational schools for public health care and welfare. The test measures the vocational qualifications of students at the end of their schooling. The interactive test is an adequate solution for testing theoretical knowledge and practical skills in a simulated real-life situation. There is an indication that open actions of the students discriminate better between good and weak students than other types of questions.


In vocational education and, more specifically, in the welfare-oriented sectors of secondary vocational education, it is hard to assess the whole complex of vocational qualifications of students. The aims of vocational education are often complex and more practical and attitudinal than purely cognitive in structure. In particular, education for a number of specific vocational qualifications at a higher level is based on a complex structure of theoretical knowledge and practical skills. It is a serious problem for schools to establish a sound, real-life kind of assessment for these skills. Examples of skills which are difficult to test are the efficient use of a computer and proper social interaction with other people as part of professional duties.

Whereas it is difficult to make a valid test of a combination of theoretical knowledge and practical skills in a real-life situation, it is relatively easy to make a reliable test for the cognitive aspects of this subject only. The problem is that the results of those tests give only partial information about a student's ability to perform a professional job. A possible solution may be found in the interactive use of videodiscs (Laurillard, 1987). It is supposed that testing based on the interactive use of video is an appealing and efficient way of measuring learning results (Sales, 1989).

Requests for reprints should be addressed to Fred Bosman, CITO, P.O. Box 1034, 6801 MG Arnhem, The Netherlands.

    Goals of the Project

The first important goal of the project was to develop a reliable, valid, and attractive test with the use of an interactive videodisc (CITO, National Institute of Educational Measurement, 1990). We wanted to gain experience in developing interactive tests. In the near future, there will be a great demand for reliable, attractive, and easy-to-use instruments for learning and testing (Ives, 1989). Therefore, we feel that it is necessary to keep up-to-date with, and explore further, the nature of interactive technologies.

    The second goal of this project was to stimulate thinking about the adjustment of a number of psychometric measures for interactive testing. Interactivity means that the problems in the test are related to the actions of the student. The consequence of this interactivity is that the test is not a classical test with identical questions for each student. This is a reason to develop another way of thinking about difficulty, reliability, and validity. It is not the intention of the project to establish a new test theory, but the results of the evaluation in schools will be used for future research.

    Background of the Project

In The Netherlands, the departments of pharmaceutical chemists' assistants form part of the schools for social welfare and public health service. These departments were selected for the application of a new kind of assessment of complex skills. The subject of the test is making and dispensing medicines to patients. Examples of skills in pharmaceutical practice which can be assessed by means of an interactively used videodisc are (a) explaining the use of the medicine in ordinary language, (b) interpreting computer messages, and (c) making the right choices from the stock of medicines with respect to the expiry date.

The interactive video test1 for pharmaceutical chemists' assistants consists of six cases in which the student has to serve the client. The test is described in the next section. The activities and decisions of the student are scored during the test. The third section describes the scoring and the reports on the effectiveness and efficiency of students' actions. The results of the tryout, conducted with 150 students, are presented later in the article.


The student is offered six cases simulating real-life situations in a pharmacy. For each case, her or his job is to handle the whole process of concluding a prescription in a pharmacy, from the moment a client presents it to the delivery of the medicine. The student has to show an effective and efficient way of handling the job. Everything that has to be done in a real pharmacy is present in the test: (a) answering questions of patients at the counter, (b) putting questions to patients, (c) handling the administrative computer program, (d) making the right choice of medicines written on the prescription, and (e) delivering the medicines with the right information to the patient.


All kinds of problems are included in the six prescriptions (each prescription is a separate case). For instance, prescriptions do not meet the legal requirements, patients' data are not present in the files, different medicines interact in a dangerous way, doses are too high, medicines need a specific way of handling, and so forth.

    Example of a Case

In one of the six cases, the old Mr. Van der Mey enters the shop with a prescription for his heart. Figure 1 presents the first computer screen of this case to illustrate the interface used. The student may ask questions such as: "Do you want to wait for your medicines?" Van der Mey answers via the monitor: "Is it possible to deliver it at my home?" The student types: "Yes, that is possible."

Students can consult the chemist, the doctor, or a colleague when they think that they are not allowed to make an independent decision. The most relevant actions and reactions are worked out on video with actors. When the student asks a nonrelevant question, the client acts a bit surprised: "What do you mean?" If the student has obtained enough information about Van der Mey, a simulation of a computer program for pharmaceutical administration is used to fill in the prescription data. Thereafter, the medicines have to be prepared and delivered to Van der Mey. A student works individually for about one hour to complete one of the six cases.


According to the criteria of Gery (1987), the test is highly interactive. The student is questioning the program and the program is answering (mostly in video and audio), and vice versa. The program determines the next step in the test depending on the student's answer. All the steps together make up the path of a student through the test.

Figure 1. Mr. Van der Mey presents his prescription.

    There are three different types of questions in the test:

1. Multiple choice questions. The question and the possible answers are presented by the program and the student has to make a choice using the mouse.

2. Open questions. The program formulates questions; for instance, the patient asks a question. The student reacts by typing an answer in the text window. The program checks the answers against a prerecorded list of check-words.

3. Open actions. There is no question formulated by the program; the action is initiated by the student, for instance, making notes on the prescription or consulting a physician, chemist, or colleague.
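The check-word matching for open questions (type 2 above) can be sketched in a few lines. The simple substring rule, the function name, and the example check-words below are illustrative assumptions, not CITO's actual implementation; they only show the idea of comparing a typed answer against a prerecorded word list on the 0-10 scale used elsewhere in the test.

```python
def score_open_answer(answer, check_words, required=1):
    """Score a typed open answer against a prerecorded list of check-words.

    The 0-10 scale follows the scoring described in the text; the matching
    rule (simple substring search, case-insensitive) is an assumption.
    """
    answer = answer.lower()
    hits = sum(1 for w in check_words if w in answer)
    # Full marks when enough check-words appear, zero otherwise.
    return 10 if hits >= required else 0

# Hypothetical check-words for a patient's question about storage:
print(score_open_answer("Keep it in a cool and dark place", ["cool", "dark"], required=2))
```

A production version would of course need stemming and synonym handling; the sketch only illustrates the check-word principle described above.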

The test has a conditionally branched structure. Starting from the prescription there are different alternatives for branching, depending on the type of questions and the answers given. In the next node there are again a number of alternatives, and so on. All the steps together (the path) will usually be different for each student. One path, the so-called ideal path, represents the correct and most efficient way of proceeding through the task. The program as a whole has a very open structure. If a student needs help, there is the option of asking a hint or assistance from the chemist.
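A conditionally branched structure of this kind can be sketched as a map from each node to the next node per student action. The node and action names below are illustrative assumptions, not taken from the actual videodisc program; they only show how a student's sequence of actions determines her or his path.

```python
# Minimal sketch of a conditionally branched test: each node maps possible
# student actions to the next node. Names are illustrative assumptions.
branches = {
    "accept_prescription": {"ask_name": "input_data", "ask_irrelevant": "accept_prescription"},
    "input_data": {"enter_data": "prepare_medicine", "ask_hint": "input_data"},
    "prepare_medicine": {"fetch_stock": "deliver", "consult_chemist": "prepare_medicine"},
    "deliver": {},
}

def follow_path(start, actions):
    """Return the sequence of nodes visited for a list of student actions."""
    node, path = start, [start]
    for action in actions:
        node = branches[node].get(action, node)  # unlisted action: stay at the node
        path.append(node)
    return path

# The ideal path proceeds without detours:
print(follow_path("accept_prescription", ["ask_name", "enter_data", "fetch_stock"]))
```

Two students giving different answers thus traverse different paths through the same structure, which is why the test is not a classical test with identical questions for each student.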

After each case the student has the possibility to receive feedback. Feedback during the test contradicts the classical way of testing. Feedback in interactive testing, however, is necessary (Cohen, 1984). So, feedback is presented both during the test and at the end of the test. If desired by the student or the teacher, the feedback repeats the case after its completion by playing it as if it were a movie. The case is supplemented with comments and remarks on the student's actions, given by the chemist. The teacher is able to give a judgement and/or mark based on the whole set of actions. For instance, a teacher may give advice on what to revise and when to repeat the test. The next section focuses on the scoring of the test and the reporting of results.


To offer the possibility of pacing the educational process by the test results, it is necessary to know exactly where deficiencies in knowledge and skills occur. The score of the test is therefore not given as one mark, but is divided into five content categories: (a) accepting the prescription, (b) handling the computer program for pharmaceutical administration, (c) searching for or preparing the medicine, (d) managing the stock, and (e) delivering the prescription.

The student obtains two scores for each content category: a judgement on effectiveness and a judgement on efficiency. The effectiveness score shows to what extent a student has succeeded in solving the test problems regardless of the way in which those solutions were reached. The best way of solving a problem (i.e., the ideal path) consists of all the steps which have to be taken to serve a patient correctly. The efficiency score shows to what extent a student did her or his job efficiently. A high score indicates that the student did not make detours from the ideal path and did not repeat steps.


Each step in the student's path through the test is given a score. The scoring is realized as follows: (a) Each step is labelled with its content category, (b) crucial mistakes are labelled for special audio feedback at the end of the session, and (c) each step is rated for its contribution to an ideal test result (the steps of the ideal path).

If a step is necessary for an ideal test result, the step contributes both to the score for effectiveness and the score for efficiency. If the step is not necessary, it counts only for the efficiency score. This means that a student who has done all the ideal steps, but with detours, obtains an effectiveness score of 100%; however, the efficiency score will be low.

The process is illustrated with a simplified route of a hypothetical student, as depicted in Figure 2. The steps on the ideal path count for both the effectiveness score and the efficiency score. The steps on the detours count for the efficiency score only.

Figure 2. A route of a hypothetical student. [10] = student has to ask for the name of the patient who is collecting his medicine; [20] = student has to give an answer to a question about how to store the medicine; [30] = student has to tell the patient that the treatment must be completed. (Legend: ideal path, detour, skipped step.)

Scores between 0 and 10 are possible. The right action or answer is marked with 10; a good answer, but not the right one, is marked with a number between 5 and 10; an indifferent, neutral answer is marked with 5; an insufficient but not totally wrong answer is marked with a number between 0 and 5; and a completely wrong action is marked with 0. The full scale of marks is neither present nor reasonable for each step. Sometimes, only yes or no (that is, right or wrong) is available.


For each student, the effectiveness score is calculated as the total score of the steps on the ideal path, divided by the highest possible score on this ideal path (McGuire, Solomon, & Bashook, 1976). Multiplication of this score by 100 yields the percentage of effectiveness. The score on the ideal path is established according to three principles: (a) If a student repeats the same ideal step, only the last step is scored; (b) an ideal step that is skipped is scored as 0, in conformity with the existing procedure of testing practical skills; and (c) an answer given after a request for a hint is also scored as 0.

The efficiency score is calculated as the student's total number of scores that are higher than 5, divided by the sum of all ideal steps and all other steps. Again, multiplication of the score by 100 yields the percentage of efficiency. All the steps are counted in calculating the efficiency, including the repeated steps.

To give an example of the calculation, Table 1 displays the steps taken by four students. Student 1 is an ideal student, taking the ideal path all along; Student 2 makes two mistakes; Student 3 skips Step 30 and makes a number of mistakes; and Student 4 does everything wrong. Figure 3 presents a visualization of the four paths.

The ideal steps for these scores are 10, 20, and 30. The maximum score for this path is 30. Student 1 scores 30; Student 2 also scores 30 (please note that Step 20 is not counted the first time); Student 3 scores 7; and Student 4 scores 0. The effectiveness scores are then:

Student 1: 30/30 * 100 = 100%
Student 2: 30/30 * 100 = 100%
Student 3: 7/30 * 100 = 23%
Student 4: 0/30 * 100 = 0%

Table 1. The Routes of Four Students

Student 1: [10] 10, [20] 10, [30] 10
Student 2: [10] 10, [20] 7, (25) 5, [20] 10, [30] 10
Student 3: [10] 0, (15) 7, (16) 0, [20] 7, (17) 0, [30] skipped
Student 4: [10] 0, (15) 0, (16) 0, (17) 0, (25) 5, [20] 0, [30] 0, and further detours, none scoring above 5 (13 steps in all)

Note. [ ] = ideal step, ( ) = detour.

Figure 3. The routes of the four students in a flowchart.

To calculate the efficiency score, all steps are counted. Student 1 has three scores that are higher than 5, there are three essential steps, and the student makes no detours. Student 2 has four scores that are higher than 5, and there are three essential steps plus two extra steps, so that the denominator contains five steps. The same reasoning applies for Students 3 and 4. The efficiency scores are then:

Student 1: 3/3 * 100 = 100%
Student 2: 4/5 * 100 = 80%
Student 3: 2/6 * 100 = 33%
Student 4: 0/13 * 100 = 0%
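The two calculations above can be reproduced in a short sketch. The step scores for Student 2 are taken from Table 1 (the first attempt at Step 20 scored 7 and is not counted; the retried Step 20 scored 10); for Student 3 only the ideal-path total of 7 is given in the text, so its split across steps is an assumption. The function names are our own.

```python
def effectiveness(ideal_scores, max_score):
    """Percentage: total score on the ideal-path steps divided by the maximum.

    `ideal_scores` holds one score per ideal step, with only the last attempt
    counted for repeated steps, and 0 for skipped steps or answers given
    after a hint, as described in the text.
    """
    return round(sum(ideal_scores) / max_score * 100)

def efficiency(all_scores):
    """Percentage of all steps taken (detours and repeats included) that
    scored higher than 5."""
    return round(sum(1 for s in all_scores if s > 5) / len(all_scores) * 100)

# Student 2: ideal steps [10], [20], [30], with the first [20] attempt dropped.
print(effectiveness([10, 10, 10], max_score=30))   # 100
print(efficiency([10, 7, 5, 10, 10]))              # 80

# Student 3: ideal-path total of 7 (assumed split 0 + 7 + 0), Step 30 skipped.
print(effectiveness([0, 7, 0], max_score=30))      # 23
```

The printed percentages match the worked examples in the text.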


The report has to give a clear overview of the skills that are mastered and those that are not. Reporting is not restricted to one integrated mark but is divided into the five content categories mentioned before. There are three kinds of reports: (a) a report of scores per student, (b) a report on serious mistakes, and (c) a report of scores per class. In this way a clear picture of the student's ability is given. Figure 4 presents a screen-dump of the results of a student on effectiveness and efficiency.

Apart from marks for effectiveness and efficiency for each student, serious mistakes are mentioned in the feedback movie. These kinds of mistakes have to be prevented. Examples of serious mistakes are the choice of a wrong medicine, or an incorrect concentration or wrong quantity. These crucial mistakes are stated in the student report as follows: "You made a very serious mistake: On the prescription of Mr. Van der Mey was stated DIGITOXIN. You delivered DIGOXIN to him." The class reports give teachers the opportunity to compare results of particular students with the average result of a whole class. It is also possible to determine a problem in a particular class, for instance, with the topic of stock management. It is possible for a teacher to print the reports afterwards.


Figure 4. Screen-dump of the results of a student on effectiveness and efficiency.

The test was evaluated at five schools using students at the end of their second or third school year. The hardware was placed in a separate room where one student was able to work individually for two hours. When finished, the student called another student out of the class. The results of 143 students on one case of the test were analyzed.

    Results of the Effectiveness Score

Table 2 gives an overview of the data from the analysis. The results are given for three content categories; the categories stock-keeping and delivery do not occur in this case.

The homogeneity of the total test (alpha = .58) is low considering the number of items. However, the total test is not a homogeneous entity; it consists of three parts (content categories). Unfortunately, the homogeneity of the separate parts is low too. But, if one considers that it is the intention that each student will do five cases, the estimated homogeneity with a test length five times longer seems acceptable. Actually, the reliabilities of the parts are not directly comparable because the test length is of influence; an extrapolation to a test with 50 items shows comparable values.
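The extrapolated reliabilities in Table 2 follow from the Spearman-Brown prophecy formula; a quick check of the reported values:

```python
def spearman_brown(alpha, k):
    """Estimated reliability of a test lengthened by factor k (Spearman-Brown)."""
    return k * alpha / (1 + (k - 1) * alpha)

# One case -> five cases (length * 5), and a fixed length of 50 items
# (k = 50 / number of items), using the alphas reported in Table 2:
for name, alpha, n_items in [("accept", 0.44, 7), ("input data", 0.55, 20),
                             ("prepare", 0.39, 24), ("total", 0.58, 51)]:
    print(name,
          round(spearman_brown(alpha, 5), 2),
          round(spearman_brown(alpha, 50 / n_items), 2))
```

This reproduces the .80/.86/.76/.87 (length * 5) and .85/.75/.57/.58 (50 items) rows of Table 2.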

Table 2. Results of the Effectiveness Score

Measure                     | Accept prescription | Input data | Prepare medicine | Total
Number of items             |          7          |     20     |        24        |   51
Maximum score               |         70          |    200     |       260        |  530
Homogeneity (alpha)         |        .44          |    .55     |       .39        |  .58
Alpha with length * 5 (a)   |        .80          |    .86     |       .76        |  .87
Alpha with 50 items (a)     |        .85          |    .75     |       .57        |  .58
Standard deviation          |      12.12          |  20.04     |     20.38        | 35.53
Mean difficulty level (b)   |        .50          |    .67     |       .75        |  .70
Standard error (SE)         |       9.05          |  13.51     |     15.86        | 22.99
SE in % of max. score       |      12.90          |   6.80     |      6.10        |  4.30

(a) Spearman-Brown formula. (b) A number between 0 and 1 that indicates the level of difficulty, calculated as the mean score divided by the maximum score.

What also deserves attention is the rather low standard deviation: the test shows little discrimination. The low standard deviation appears to be the effect of a large number of items that are correctly answered by nearly all students. The test strives after a high similarity between the measured skills and the job skills. In view of this similarity, the high scores of the students are not a problem, because they indicate that the test actually measures the knowledge and practical skills mastered by each student. The mean difficulty level shows that students score from 50% on the content category accepting the prescription to 75% on the content category preparing/fetching the medicine. Thus, the mean difficulty level is moderate.

In spite of the low homogeneity, the standard error of the test scores seems acceptable. This is an effect of the low standard deviation. For example, a standard error of 6.1% of the maximum score is observed for the content category prepare medicine. To estimate the true report percentage of a student with 95% certainty, a percentage of 1.96 * 6.1 is added to and subtracted from the student's own percentage. Thus, the true score of a student with a score of 70% is, with 95% certainty, between 58% and 82% (a range of 24%).
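The 95% interval in this example can be checked with the usual ±1.96 standard errors; the function name is our own.

```python
def true_score_interval(observed_pct, se_pct, z=1.96):
    """Confidence interval for a student's true report percentage
    (z = 1.96 for 95% certainty)."""
    margin = z * se_pct
    return observed_pct - margin, observed_pct + margin

# The example from the text: observed 70%, SE 6.1% of the maximum score.
low, high = true_score_interval(70, 6.1)
print(f"{low:.0f}% - {high:.0f}%")  # 58% - 82%
```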

    The Efficiency Score and Criteria for Classical Test Analysis

The efficiency score does not match the requirements of classical test analysis, because not all students go through the same path. Due to its interactivity, the program reacts to both correct and incorrect actions. Measures such as homogeneity cannot be given with the available analysis techniques.

Features of the Items

Three kinds of questions were used: (a) multiple choice questions, to be answered with the mouse; (b) open questions, such as questions from a client or questions from the computer simulation to enter the data of a client; and (c) open actions, that is, actions that the student has to perform on his or her own initiative (thus, the student can omit such actions). In Table 3, difficulty levels and item-test correlations are given for the three different kinds of questions. The item-test correlation is used here to gauge the discriminating power of each type of question. This measure says something about the relationship between the question and the test as a whole. The item-part correlations give the relationship between the score on a question and the score on the particular content category.

It is not possible to draw firm conclusions because the number of questions is too low to obtain significant results. The multiple choice questions and open questions seem to be easier than the open actions. In addition, the open actions discriminate better between good and weak students than both other types of questions. With regard to the difficulty level this outcome was to be expected, because the students can overlook an open action (in which case the score is zero).


Table 3. Data From the Test Analysis Related to the Kind of Question

Kind of question             |  n | Difficulty (a) | Item-test r (b) | Item-part r (c)
Multiple choice with mouse   | 22 |      0.89      |      0.12       |      0.16
Open question, short answer  |  9 |      0.78      |      0.11       |      0.23
Open action                  | 20 |      0.51      |      0.25       |      0.38

(a) Mean difficulty level. (b) Item-test correlation. (c) Item-part correlation.


Classical test analysis is used when one wants to know whether a test is reliable. In the following, we discuss whether an interactive test can be qualified as reliable with a classical test analysis. Three criteria of such an analysis are:




1. Each student makes the same items. In our interactive test, each student scores on each step that counts for the effectiveness score, so that this criterion is met. However, this criterion is not met for the efficiency score.

2. The test should be taken under the same circumstances and students should have the same aids at their disposal. This criterion was not controlled during the tryout.

3. Items must be mutually independent; that is, the chance to correctly answer an item depends on the student's proficiency and not on the answers given to other items.

As far as independence between items in the interactive test is concerned, there are some problems. As an example, Figure 5 presents the routes of two students. As may be seen from this figure, Gaby performs Step 40 right, and William performs Step 40 wrong. He does the extra Step 41 in a detour, in which he receives information +A, and he arrives again at Step 40. This time he performs Step 40 correctly. The effectiveness scores of Gaby and William are both 10. However, Gaby was not aware of information +A, and therefore Step 40 could have been more difficult for her than for William. Step 40 can be interpreted as a black box: everyone has the possibility to perform extra steps in a detour, and the points are scored without examining how the student managed to do that. But, if one does not consider Step 40 as a black box, the items in the test are no longer independent. As a conclusion, the independence of the test is disputable.

Figure 5. The routes of Gaby and William.

To conclude, there is no certainty of a complete match with the criteria for classical test analysis. Whereas the students make the same items for the effectiveness score, independence is disputable, and there is no information about comparable circumstances. Thus, the data from the test analysis should be seen as indications and not as experimental data.


    The first important goal of the project was to develop a valid and attractive test with the use of an interactive videodisc. Given our experiences, an interactive test using audio and video is able to create a sound simulation of a professional environment.

The second goal was to study how classical test analysis could help to determine the reliability of an interactive test. Firm conclusions about reliability cannot be drawn, because not all criteria for classical test analysis are met. In an interactive test, students do not go through the same path or answer the same questions. Moreover, the actions are often closely related to each other. Further research on interactive test analysis is needed, perhaps in the direction of qualitative test analysis.

The method of scoring on effectiveness and efficiency seemed to be an accurate one, together with the information on crucial mistakes. The reports yield a complete picture of how students perform. There is an indication that open actions, which students have to initiate, discriminate better between good and weak students than multiple choice and open questions.

    Current Situation

    At this moment, the hardware is too expensive to be bought by individual schools. Therefore, the hardware circulates between schools on a kind of lease contract. They can use the test environment for some weeks as training and testing material. It is an aim of the National Institute of Educational Measurement to develop a range of interactive tests for other vocational areas.


1. The interactive video test runs on an IBM PC (or compatible) with the MS-DOS operating system, a hard disk (20 MB), a video-overlay card, a field-store card, a floppy drive, and a mouse. A LaserVision player is connected for the stills and motion video (Betacam) with audio. A CD-ROM player with digital audio is used for all the audio feedback.


CITO, National Institute of Educational Measurement. (1990). Beeldplaattoets voor apothekersassistenten [Videodisc test for pharmaceutical chemists' assistants]. Arnhem, The Netherlands: CITO. (Available from CITO, P.O. Box 1034, 6801 MG Arnhem)


Cohen, V. B. (1984). Interactive features in the design of videodisc materials. Educational Technology, 24, 16-20.

Gery, G. (1987). Making CBT happen. Boston: Weingarten.

Ives, W. (1989). Making evaluation part of the training. Educational Technology, 29, 50-5.

Laurillard, D. (Ed.). (1987). Interactive media: Working methods and practical applications. Chichester, England: Ellis Horwood.

McGuire, C. H., Solomon, L. M., & Bashook, P. G. (1976). Construction and use of written simulations. The Psychological Corporation.

Sales, G. C. (1989). Applications of videodiscs in education. The Computing Teacher, 16(8), 27-29.
