THE DEVELOPMENT AND EVALUATION OF

A DIAGNOSTIC SYSTEM OF REMEDIATION

FOR AN AUTO-TUTORIAL COURSE IN

GENERAL COLLEGE CHEMISTRY

A Dissertation

for the Degree of Ph. D.

MICHIGAN STATE UNIVERSITY

Gary William VanKempen

1977

LIBRARY

Michigan State

University

This is to certify that the

thesis entitled

THE DEVELOPMENT AND EVALUATION OF A

DIAGNOSTIC SYSTEM OF REMEDIATION



presented by

GARY WILLIAM VANKEMPEN

has been accepted towards fulfillment

of the requirements for

Ph.D. degree in Chemistry and

Administration & Higher

Education

Major professor

Date December 10, 1976


ABSTRACT





BY

Gary William VanKempen


A unique system of diagnosis and remediation has

been developed for an auto-tutorial computer managed

course in general college chemistry. The system is based

upon an analysis of examination questions used in the

course to identify the kind or kinds of thinking required

by each question. A task analysis was used to identify

six important kinds of thinking: memorization,

translation, classification, visualization, reasoning,

and reasoning with math.

The validity of these categories was tested by

examining the agreement obtained when several content

experts classified the questions independently. The

highest interclassifier agreement (over 90 percent) was

obtained for the memorization and reasoning categories.

For each of the other categories, the agreement was

approximately 75 percent. A further test of validity

compared the inter-item correlation coefficients for

pairs of questions requiring the same kind of thinking with those for

pairs of questions in which the kinds of thinking were not

matched. For the memorization and reasoning with math

categories, the correlations between questions of the same kind

of thinking were significantly higher (α = .05) than

those between questions of different kinds of thinking.

An experimental remedial system was designed in

which students who scored below 60 percent on previous

examination questions received remediation based on the

kind of thinking in which they were most deficient. The

two categories on which the remediation was based were

memorization and reasoning with math. A distinction was

made between scores obtained from tests involving content

discussed in remediation (the initial learning score)

and scores obtained when the student was being introduced

to new material (the transfer score). The latter repre-

sents the transfer of training in a particular kind of

thinking to a new topic in the course.

Supplementary instructional materials and classes

were made available to students for the first four weeks

of a ten-week term. For both the memorization and the

reasoning with math categories, the experimental group

scored significantly higher than the control group on the

initial learning score but there was no significant

difference between the groups on the transfer score.

Thus remediation seems to improve performance on material


discussed during the remedial class, but the improvement

is not maintained when new material is introduced.





BY

Gary William VanKempen


A DISSERTATION

Submitted to

Michigan State University

in partial fulfillment of the requirements

for the degree of

DOCTOR OF PHILOSOPHY

Departments of Chemistry and Higher Education and Administration

1977

TO DORINDA

ACKNOWLEDGMENTS

I would like to express my sincere thanks to Dr. Robert

N. Hammer for his guidance throughout my graduate study.

A special thanks is also extended to Dr. Ed Smith for

his friendship and for his assistance with this project.

Appreciation is also extended to Dr. Jack B. Kinsinger

who provided a much needed inspiration.

This work was supported in part by a grant from the

Educational Development Program of Michigan State Uni-

versity and a grant from the Alfred P. Sloan Foundation

administered through the College of Engineering at Michigan

State. Appreciation is extended to the Chemistry Depart-

ment of Michigan State University for supporting me during

the initial stage of my graduate study and providing me

with my first significant opportunity to teach.

Finally, I would like to thank my parents, Peter and

Manual VanKempen, and my wife, Dorinda, for their continued

support and love.

TABLE OF CONTENTS

Chapter

LIST OF TABLES
LIST OF FIGURES
INTRODUCTION: AN OVERVIEW OF THE PROJECT
    The Problem
    Goals of the Project
    Research Questions
    Generalizability of the Results
HISTORICAL: A REVIEW OF THE LITERATURE
    Classifying the Outcomes of Education
        Gagne's Classification of Learning
        Science Processes
        Empirical Support of Taxonomies
    Diagnosis and Remediation
        Diagnosis Based on Piagetian Theory
        The Effects of Diagnostics
EXPERIMENTAL METHODS AND PROCEDURES
    The Instructional Setting
    The Computer Management System
    The Classification Scheme
        Developing the Categories: A Task Analysis
        Kinds of Thinking
        Criteria and Procedures for Characterizing Questions
    Validation of the Categories
        Reliability
        Agreement Among Classifiers
        An Analysis of Correlation Coefficients
    The Remedial System: Project CLIC
        Selecting the Sample
        The Design
        The Treatment
    Evaluation of the Remedial System
SUMMARY AND DISCUSSION
    Overview of the Project
    The Validity of the Classification System
        Reliability
        Interclassifier Agreement
        Analysis of Correlation Coefficients
    Evaluation of Project CLIC
IMPLICATIONS FOR FUTURE RESEARCH AND DEVELOPMENT
    A Chemical Education Laboratory
    The Difficulty Index
    Alternative Validity Studies
    Selecting the Sample
    A Piagetian Classification
REFERENCES
APPENDIX A
APPENDIX B
APPENDIX C
APPENDIX D

LIST OF APPENDICES

Letter   Referred to on Page   Title

A        23                    Fortran Programs
B        34                    Characterization of Questions
C        40                    Calculating Grouped Average Correlation Coefficients
D        49                    Outline of CLIC Material

LIST OF TABLES

Table

I.    Interclassifier Agreement
II.   Averages and Standard Deviations for Grouped Sets of Correlation Coefficients
III.  Planned Comparisons of Grouped Correlation Coefficients
IV.   Two-way Analysis of Variance of Grouped Correlation Coefficients
V.    T-test Evaluation of Project CLIC
VI.   A Summary of Interclassifier Agreement

LIST OF FIGURES

Figure

1.    A Two-way Analysis of Variance

INTRODUCTION: AN OVERVIEW OF THE PROJECT

The Problem

It is obvious to anyone teaching chemistry at the

introductory college level that many students find the

subject difficult to comprehend. Unfortunately, the

sources of this difficulty are much less obvious, as is

the remedy. The problem is compounded by the trend toward

less stringent admission requirements, the increasing

number of special programs for students who are seeking

post-secondary education, but do not meet standard admis-

sion requirements, and the large enrollments in courses

for non-majors which are prerequisite for courses in other

academic areas. College science teachers are being asked

to deal effectively with students who traditionally would

have been considered incapable of doing science, either

because of poor backgrounds in high school science or a

lack of sufficient basic abilities. These students, when

placed into courses with more capable students, present a

difficult problem for instructors. The more heterogeneous

a group of students becomes, the more difficult it is to

present materials and tasks which challenge students, but

represent reasonable expectations. There have been many

attempts to solve this problem through programmed instruc-

tion, personalized systems of instruction, or through

traditional remediation.

While there are advantages and disadvantages accompany-

ing each of these methods, one important advantage of re-

medial instruction is its supplementary nature. Any pro-

cedure for remediation can be developed as an addition

to, rather than a modification of, an existing course.

Remedial instruction, however, can only be effective to the

extent that it focuses accurately on student learning

problems. The identification of these problems is a dif-

ficult task which usually requires a substantial amount of

student-teacher contact. Often, high enrollment and an

emphasis on decreased cost prevent much of the

contact between students and instructors which is

necessary for any kind of systematic look at student learning

problems.

Goals of the Project

This project begins with the assumption that there

are a significant number of students in introductory fresh-

man chemistry courses who do not achieve at the level they

could because they have not developed some of the skills

and abilities necessary for success. We realize that there

are other factors, such as motivation, attitude, and prior

experience in studying chemistry which also affect a student's

achievement. We have, however, chosen to focus our atten-

tion on a specific set of abilities to be identified through

a study of student performance on course examinations.

This set of abilities will be divided into categories which

are referred to as "kinds of thinking". These kinds of

thinking are identified through a task analysis of the

questions typically asked on freshman chemistry exams.

From this task analysis, generalized task descriptions are

developed which are then converted into descriptions of

specific kinds of thinking. Each exam question can then

be characterized according to the kinds of thinking it

requires.

The specific goals of this project can be considered

in three parts:

a. To develop and validate a series of criteria by

which one can classify questions commonly asked

in general chemistry. This classification will

be based upon the kinds of thinking required to

answer each question.

b. To develop a system of diagnosis which identifies

students who are scoring poorly on a group of

questions which require the same kind of thinking.

c. To develop and evaluate remedial materials which

focus on particular kinds of thinking as they apply

to general chemistry.

The purpose of the criteria described in part "a"

above is to guide the characterization of test items in

terms of the kinds of thinking required to answer them.

Once questions are characterized in this manner, it should

be possible to group them into separate classes depending

upon whether the question does or does not require specific

kinds of thinking. Examination of student scores on a

particular class of questions may reveal a pattern of misses

which could be traced to a lack of ability to perform the

kind of thinking required. Remediation which focuses on

these thought processes as they relate to chemistry might

then improve scores on this particular class of questions.
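
The grouping step described above can be illustrated with a short Python sketch. It is only an illustration under stated assumptions, not the programs used in this project: the question identifiers and the kinds of thinking assigned to them are hypothetical, and only the category names come from the abstract.

```python
from collections import defaultdict

# Hypothetical characterization: question id -> kinds of thinking it requires.
CHARACTERIZATION = {
    "Q1": {"memorization"},
    "Q2": {"reasoning with math"},
    "Q3": {"memorization"},
    "Q4": {"translation", "reasoning with math"},
    "Q5": {"reasoning with math"},
}


def group_into_classes(characterization):
    """Group questions that require exactly the same kinds of thinking."""
    classes = defaultdict(list)
    for question, kinds in characterization.items():
        # frozenset makes the set of kinds usable as a dictionary key.
        classes[frozenset(kinds)].append(question)
    return dict(classes)


def questions_requiring(characterization, kind):
    """List every question that requires the given kind of thinking."""
    return sorted(q for q, kinds in characterization.items() if kind in kinds)


classes = group_into_classes(CHARACTERIZATION)
math_questions = questions_requiring(CHARACTERIZATION, "reasoning with math")
```

Grouping by the exact set of kinds keeps a question that requires two kinds of thinking apart from questions that require only one of them, while the second helper supports the simpler "does or does not require" split.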

Research Questions

In general, the project can be viewed as a study of

student performance on examination questions in general

chemistry as a function of the kinds of thinking required

by the questions. To facilitate the discussion of the

research questions, the following definitions have been

established.

a. Class of questions: All questions which are classified

as having a particular kind of thinking or

series of kinds of thinking in common.

b. Topic: Any specified content which is discussed

sequentially during a course.

c. Subtest score: The percentage of correct responses

to a specified group of questions which are all

members of the same class of questions.
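
As a rough illustration of definition "c", the following Python sketch computes a subtest score for each class of questions and picks out the weakest class. The data are hypothetical; the 60 percent cutoff echoes the figure quoted in the abstract, and applying it to each subtest separately is an assumption made here for illustration.

```python
# Hypothetical data: the class each question belongs to, and one
# student's results (True = answered correctly).
QUESTION_CLASS = {
    "Q1": "memorization",
    "Q2": "reasoning with math",
    "Q3": "memorization",
    "Q4": "reasoning with math",
    "Q5": "reasoning with math",
}
RESULTS = {"Q1": True, "Q2": False, "Q3": True, "Q4": False, "Q5": True}

CUTOFF = 60.0  # percent; assumed to apply to each subtest separately


def subtest_scores(results, question_class):
    """Return the percentage of correct responses for each class of questions."""
    correct, attempted = {}, {}
    for question, is_correct in results.items():
        cls = question_class[question]
        attempted[cls] = attempted.get(cls, 0) + 1
        correct[cls] = correct.get(cls, 0) + int(is_correct)
    return {cls: 100.0 * correct[cls] / attempted[cls] for cls in attempted}


def most_deficient(scores, cutoff=CUTOFF):
    """Return the lowest-scoring class below the cutoff, or None if there is none."""
    below = {cls: score for cls, score in scores.items() if score < cutoff}
    return min(below, key=below.get) if below else None


scores = subtest_scores(RESULTS, QUESTION_CLASS)
flagged = most_deficient(scores)
```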

The project is designed to answer the following research

questions.

a. Does student performance on examination questions

support the categories proposed in this classifica-

tion scheme?

b. Do the remedial materials which focus on the kind

of thinking required in a specific class of questions

covering a particular topic in chemistry improve

student performance on that class of questions for

that particular topic?

c. Once students receive remediation on a particular

kind of thinking as it applies to general chemistry

in one topic, will there be any improvement in

their performance on questions requiring the same

kind of thinking but covering a different topic?

That is, is there transfer of training of a par-

ticular kind of thinking to a new content within

the course?

Generalizability of the Results

Since the content of general chemistry is fairly stan-

dard throughout the majority of freshman college courses

and many high school courses, we expect that the classi-

fication scheme will be useful to anyone teaching general

chemistry. Since the students involved are representative

college freshmen, the identified learning problems are

likely to be present in varying degrees in any general

chemistry course. The system of diagnosis is not necessarily

restricted to an individualized course, although

frequent examinations and careful record keeping are likely

to be essential. The success of the diagnostic system and

the supplementary instructional materials in improving

students' examination scores in general chemistry at Michigan

State will be a clear indication of their probable success

at other schools.

The classification of a question may be dependent on

the kind of instruction given in a course. That is, the

kind of thinking used by a student to answer a question

may depend upon the instruction received by the student.

HISTORICAL: A REVIEW OF THE LITERATURE

Classifying the Outcomes of Education

The interest in a classification of learning in terms

other than the specific content of a discipline has been

evident for decades. One of the earliest and most widely

used classifications of this type is Bloom's Taxonomy of

Educational Objectives: Cognitive Domain,1 in which Bloom

organizes learning into categories labelled knowledge, com-

prehension, application, analysis, synthesis, and evaluation.

One very popular use of this taxonomy has been to describe

the content of achievement tests. Fast2 reported an analysis

of the American Chemical Society-National Science Teachers

Association High School Chemistry Achievement Test in which

he found 40 percent of the questions to be in the knowledge

category, 25 percent each in comprehension and application

and 10 percent in the analysis category. He also discovered

a trend toward a higher percentage of application and

analysis questions during the period from 1957 to 1971.

Airaisian3 used Bloom's taxonomy to classify the objectives

of two chapters of Chemistry: An Experimental Science

by taxonomic level. He found that

high school chemistry teachers could classify the objec-

tives with a 90 percent level of agreement. Airaisian

also determined that a majority of the objectives fell into

the knowledge class and required only recall of information.

Scott4 used Bloom's taxonomy to analyze the cognitive

levels of activities and exercises in a particular set of

instructional materials. He examined an early edition

of Science--A Process Approach5 and found that many activi-

ties required application behavior and some required anal-

ysis and synthesis. As the use of the taxonomy became

widespread, there developed a consensus that many textbooks

and standardized tests fail to provide enough tasks at the

higher levels of synthesis, evaluation and application.

This has led in some cases to a decrease in emphasis on

tasks which require the student simply to recall information.

This situation illustrates the effect that a classification

of the objectives of education can have on the

direction of curricular change.

One weakness of a thought process approach to ques-

tion classification is that these processes are only

inferential constructs. They cannot be observed directly.

One cannot assume that all students answer the same questions

using the same cognitive processes. To help deal

with this problem, one should keep in mind the instruc-

tional material on which the questions being classified

are based. The chances may be greater that two students

will answer the same questions using the same processes

if they have been exposed to the same instructional material.

The classification of questions which is used in

the present project is not based upon a strict cognitive

process approach. That is, we do not intend to identify

in detail the information processing routines used by

students to answer questions. The procedure used in this

research is similar to a task analysis described by Smith6

in which tasks are described according to the characteris-

tics of the given information and the information which

the student is attempting to find. An assumption must be

made concerning the information which students bring to

a given task. For this research, the assumption will be

based upon a knowledge of the instructional material

presented to the student and the knowledge gained from

many years of experience observing how students perform

the required tasks of this freshman chemistry course.

Gagne's Classification of Learning

Another widely accepted classification of learning is

that described by Gagne. In his article on the domains

of learning,7 he categorizes learning processes into classes

described as motor skills, verbal information, intellectual

skills, cognitive strategies, and attitudes. He

emphasizes that these kinds of learning can be found to

varying extents in all disciplines. He suggests that state-

ments concerning optimum learning conditions and methods

of testing which are appropriate for one of the domains of

learning may not be appropriate for another. In particular,

the instructional procedures which maximize learning

within one domain are different from those which best

encourage learning in a different domain. For example, the

nature of instruction designed to teach the learner to

perform an acid-base titration should certainly be dif-

ferent from instruction on the nomenclature of simple

molecules. In the former case one is teaching a motor

skill, while the latter involves intellectual skills and

verbal information.

Gagne also has emphasized that different assessment

techniques are required for objectives in different domains.

One does not test possession of verbal knowledge in the same

manner as he would test the possession of intellectual

skills. If this latter idea is correct, it should then

be possible to describe the domain or domains of learning

of specific test questions and, from scores on these ques-

tions, to evaluate student learning within these specific

domains. Although Gagne's categories are still best des-

cribed as inferential constructs, they have provided sig-

nificant guidelines for the design and evaluation of in-

struction. They have also greatly influenced the nature

of the classifications proposed in this study. Since the

categories that have been developed for the present diagnostic

system can best be described as verbal information

and intellectual skills, these domains will be discussed

in more detail.

The learning of verbal information means that the

learner is able to state in declarative form what he has

learned. Verbal information comprises the facts, principles,

and generalizations which make up a large part of school

learning in any discipline. In introductory chemistry,

the student is asked to learn the chemical symbols of the

elements. This verbal information can be presented by

examining the relationships between the names of elements

and their symbols and giving the appropriate Latin or

German names. To test whether a student has learned these

symbols one usually asks that the name be declared from

the symbol, or vice versa. The learning of verbal information

does not necessarily give the learner the ability to

apply that information to a novel situation.

Gagne describes intellectual skills as "knowing how

as contrasted with knowing that"8. A student learns to

convert fractions to decimals or to represent the electronic

configuration of the elements or the Lewis dot structures

of simple molecules. An intellectual skill is a learned

capability which enables the learner to perform a particular

group of tasks if he possesses the appropriate knowledge.

The lack of an intellectual skill may prevent a student from

performing a class of tasks in spite of his knowledge of

verbal information. Studies have shown that ability to

perform a specific task is greatly increased if the student

is taught the prerequisite skills.9 Gagne and Brown10,11

investigated the learning of a task of constructing formulas

for the sums of number series. They identified the

"subordinate skills" which were necessary to perform the task

and related these skills to one another in a hierarchy.

The hierarchy therefore represents the sequence of abilities

upon which the learner would rely in order to perform the

superordinate task. Seven students who were unable to

construct the formulas were tested on each of the sub-

ordinate skills and given instruction on the skills which

they could not perform. The final task was again presented

with verbal directions about how to do it but no additional

practice. Six out of the seven students were then able

to perform the final task. Thus, through an analysis of

the task and through appropriate instruction on subordinate

tasks, the experimenters were able to significantly increase

students' ability to perform the desired task.
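
The idea of working up a hierarchy of subordinate skills can be sketched in Python as follows. This is a hedged illustration only: the skill names and their dependencies are hypothetical placeholders, not the hierarchy used by Gagne and Brown, and the ordering relies on the standard-library `graphlib` module (Python 3.9 or later).

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

# Hypothetical hierarchy: each skill maps to the subordinate skills it relies on.
HIERARCHY = {
    "final task": {"skill C"},
    "skill C": {"skill A", "skill B"},
    "skill B": {"skill A"},
    "skill A": set(),
}


def instruction_plan(hierarchy, passed):
    """List the skills a learner has not yet demonstrated, prerequisites first."""
    # static_order() yields each skill only after every skill it relies on,
    # so subordinate skills always precede the skills built upon them.
    ordered = TopologicalSorter(hierarchy).static_order()
    return [skill for skill in ordered if skill not in passed]


# A learner who can already perform skill A but nothing above it.
plan = instruction_plan(HIERARCHY, passed={"skill A"})
```

Testing each subordinate skill and then teaching only the missing ones, lowest level first, mirrors the sequence of diagnosis and instruction described in the study.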

Science Processes

An important classification of the objectives of science

education has been in terms of "science processes". A

science process can be described as a class of similar

tasks which scientists perform. These tasks include ob-

serving, comparing, classifying, quantifying, measuring,

experimenting, inferring, and predicting. The idea of

science processes is included in this review because the

philosophy behind the characterization of these processes

has influenced the conception of the present project. A

science process is a class of similar tasks which scientists

perform, while each category developed in this study is a

class of similar tasks which students are required to

perform. Exploring the nature of these classes of tasks

is an important step in determining the skills which are

necessary to perform them.

Guided mostly by the work of Schwab, efforts were made

to characterize the nature of what it is that scientists

do with the information that they obtain and how they go

about obtaining it. Despite Schwab's warnings, the science

teaching establishment accepted the notion that these

processes, once identified, would prove to be common throughout

the various scientific disciplines. Although this did

not prove to be completely true, the notion that there

are certain important abilities which are essential com-

ponents of many science disciplines has remained. Research

in this area has focused on the measurement of student performance

of these process objectives, relating this performance

to overall course achievement, and studying the

nature of the processes themselves.

Empirical Support of Taxonomies

A serious criticism raised against the taxonomies which

have been described is that the abilities suggested in

these categories are not reflected in measurements of

student performance. For example, Tannenbaum12 studied

science processes by means of an empirical test which he

designed. The science processes which he studied include

observing, comparing, classifying, quantifying, measuring,

experimenting, inferring, and predicting. His test included

96 items chosen from several of the natural sciences. The

author suggests (but does not substantiate) that student

performance on the test should not depend upon the distribu-

tion of questions among the various disciplines. Tannen-

baum used textbooks and various research reports to develop

a list of behaviors which science students are expected

to exhibit. These behaviors were then classified according

to the eight categories listed above. The test was admin-

istered and several statistical procedures were applied

to the results. The author reports overall test reliability

as well as subtest reliability, which is the reliability of

a group of questions from only one of the categories men-

tioned. For four of the eight processes studied, the sub-

test reliabilities were not significantly greater than that

expected of a random sample of the corresponding number of

questions from the entire test. A factor analysis identified

only one general factor which accounted for about 50% of

the variance in the scores.
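
The comparison between a subtest's reliability and that of a random sample of the same number of questions can be sketched in Python. The text does not say which reliability coefficient Tannenbaum used, so the Kuder-Richardson formula 20 below is an assumption, and the score matrix and function names are hypothetical.

```python
import random
from statistics import mean, pvariance


def kr20(responses):
    """Kuder-Richardson formula 20 for rows of 0/1 item scores (one row per student)."""
    k = len(responses[0])
    totals = [sum(row) for row in responses]
    total_var = pvariance(totals)
    if k < 2 or total_var == 0:
        raise ValueError("need at least two items and some variation in total scores")
    # p is the proportion answering an item correctly; p * (1 - p) is its variance.
    item_var = sum(
        p * (1 - p) for p in (mean(row[i] for row in responses) for i in range(k))
    )
    return (k / (k - 1)) * (1 - item_var / total_var)


def subset(responses, items):
    """Keep only the listed item columns."""
    return [[row[i] for i in items] for row in responses]


def random_subset_reliabilities(responses, size, draws=200, seed=0):
    """KR-20 values for randomly drawn item subsets of a given size (the baseline)."""
    rng = random.Random(seed)
    n_items = len(responses[0])
    values = []
    for _ in range(draws):
        items = rng.sample(range(n_items), size)
        try:
            values.append(kr20(subset(responses, items)))
        except ValueError:
            continue  # skip degenerate draws with no variation in total scores
    return values


# Hypothetical 0/1 scores: four students by four items.
SCORES = [[1, 1, 1, 1], [1, 1, 0, 0], [1, 0, 1, 0], [0, 0, 0, 0]]
subtest = kr20(subset(SCORES, [0, 1]))  # items assigned to one category
baseline = random_subset_reliabilities(SCORES, size=2)
```

A category would gain empirical support under this sketch only if `subtest` were clearly larger than the typical value in `baseline`; a formal significance test is beyond what is shown here.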

In an empirical study of the hierarchical nature of

Bloom's taxonomy, Stedman13 found no significant difference

in scores between questions identified as knowledge and

comprehension, or between application and analysis. There was, however,

a significant difference between scores on questions from

the comprehension and application categories.

The failure of many of these taxonomic categories

of objectives to be validated in empirical studies may

in part be due to their general nature. In this project,

the categories are based upon an analysis of the ques-

tions typically asked in a general chemistry sequence.

We hope that these categories, which are designed not

only for a particular discipline but for a particular

group of courses within that discipline, may be more easily

validated and consequently may provide more useful information

about student learning than general categories.

Diagnosis and Remediation

There have been a number of projects reported which

are in some ways similar to the present project. This

review is limited to diagnostic and remedial systems

used in high school chemistry and physics courses as

well as in college science courses because the informa-

tion from these areas is the most generalizable to the

freshman college chemistry course used as the laboratory

for the present research.

For the purposes of this study, remediation is defined

as an attempt to supply the thinking skills prerequisite

to a particular learning task when those prerequisites are

not part of the subject matter of the course. Furthermore,

the prerequisites are defined as a set of intellectual skills

which are applied to the tasks of college level general

chemistry. Since a vast literature exists on the effects

of various remedial treatments, it seems appropriate to

consider only the effect of that part of remediation concerned

with thinking skills assumed to be possessed by most

college freshmen and which are applied throughout most

introductory science courses.

Diagnosis Based on Piagetian Theory

Recent attempts to apply Piagetian theory to college

students have produced some unexpected results. According

to Piaget,14 we each pass through four distinct periods

of intellectual development as we mature. These are:

1. Sensori-motor (0-2 years);

2. Preoperational thought (2-7 years);

3. Concrete operations (7-11 years);

4. Formal operations (11-15 years).

Research in this area has focused on (1) the development

of tests of an individual's stage of intellectual

development and (2) the mechanism by which one advances

to a higher stage and the effect of instructional experiences

on that advancement.15 For example, in a study reported

by Bredderman,16 the effect of training fifth and sixth

grade students to control variables was measured by administering

pretests and posttests involving the control of

several different variables. According to Piagetian

theory, individuals who are not yet in the formal stage

of intellectual development will be unable to perform

this task. Bredderman demonstrated a significant but

small improvement in students' ability to perform these

tasks as the result of training. There were some students

for whom the training had no effect as demonstrated by

their pretest and posttest scores. These results are

typical of the Piagetian training studies reviewed by

Beilin.15

The application of Piagetian tasks to college students

has revealed that many college freshmen do not demonstrate

an ability to think at the formal level in some situations.

McKinnon17 states that in a study at Oklahoma City University,

50 percent of 143 college freshmen failed to perform

tasks requiring thinking at the formal level. In a study

by Renner and Lawson,18 only 22 percent of a sample of

college freshmen were judged to be in the formal stage

of development. Studies by Griffith19 and Renner20

have produced similar results.

The question which remains unanswered is whether the

results with college students indicate a general developmental

retardation on the part of a vast majority of

college freshmen or an inability of these students to

think at the formal level in specific situations. An

interpretation of these results which is not in conflict

with Piagetian theory is that many students who have

advanced to the formal stage of intellectual development

do not always demonstrate their ability to think at this

level in all situations. This may be due to the particular

content involved in the Piagetian task, the effect of test

anxiety, the student's habit of reverting to concrete think-

ing in certain situations, or the lack of validity of the

particular test. Support of this interpretation comes

from a study by Danner and Day20 with children from grades

five to twelve. Initially, 50% of the older subjects and

none of the younger subjects were able to perform tasks

which require formal operations. After a few prompts,

nearly all of the older subjects and a few of the younger

subjects were able to perform at the formal level on a

different task.

Renner and Lawson also found that high school chemistry,

biology and physics students scored significantly higher

than the general population of high school students. This

may be the result of a selection of science courses by

formal thinkers, or the result of the practice one

can get in thinking at the formal level in a typical

science class, or both. The evidence does indicate,

however, that many freshman college students do not demonstrate

an ability to think at the formal level.

Few would deny that college chemistry and physics are

taught at the formal level. Students are required to deal

with relationships between variables and to have an understanding

of how these relationships are developed and

tested. If students taking these courses are not in the

habit of thinking at the formal level, they will experience

a great deal of difficulty mastering the materials in

these courses. Assuming that these students could be

identified, they could be provided with instruction de-

signed to increase their tendency to use formal operations

in their study of science. The optimum nature of this

instruction is, at this time, unknown and is a question

which deserves some serious attention.

The Effects of Diagnostics

Lawler has investigated the effects of a diagnostic

system which identifies for each student the objectives

which he has failed to master in a health sciences course

at the freshman college level.21 Exams were taken on an

IBM 1500 terminal designed for computer assisted instruc-

tion. Students who received this diagnostic information

showed greater achievement as measured by the course final

exam than students who did not. Apparently the diagnostic

information helped students focus on objectives which they

were unable to perform.

In a freshman chemistry course at the University of

Wisconsin,22 a diagnostic system called CHEM TIPS has

been established in which students take a once-a-week

survey which requires them to demonstrate knowledge of

recent course material. The responses are computer

analyzed according to a set of predetermined criteria.

Those students who miss particular questions or groups of

questions receive computer generated messages indicating

topics they should work on, textbook page numbers where

this material can be found, and times during the week when

help sessions dealing with this material will be held.

Survey results were given to the teaching assistants so

they could work on this material during recitation sections.

In an attempt to evaluate the effectiveness of this system,

one of two lecture sections taught by the same instructor

was given the option of taking the CHEM TIPS survey while

the other was not. Enrollments were 163 in the experimental

group and 167 in the control group. There was in this

case no significant difference between the average scores

on three course examinations for the two groups. There

was, however, a difference in the attitudes of students

toward their teaching assistants. The experimental group

responded more favorably to questions concerning the

teaching assistant's interest in students' progress,
effectiveness as a teacher, and ability to answer questions.

This could be the result of the teaching assistants bene-

fiting from the information provided by the CHEM TIPS

survey.

Riban23 has studied a diagnostic system which identi-

fies deficiencies in mathematical abilities based upon

patterns of correct and incorrect solutions of physics

problems by high school physics students. The mathematical

abilities were established through an analysis of the

skills required for the solution of each problem. One

hundred sixty three separate mathematical abilities were

identified. This list was decreased to 42 by rejection

of those abilities believed to be present in all students

as well as those required by only a few problems. The

decision to enter remediation for a specific ability was

based upon the percentage of missed questions which required

a specific ability. Students diagnosed as deficient in a

particular mathematical ability were randomly divided into

control and experimental groups. The experimental group

received the appropriate programmed remediation. There was

no significant difference between these two groups as

measured by scores on two subsequent Physical Science Study

Commission achievement tests. The author did not present

a breakdown of student scores on groups of questions requir-

ing the remediated ability. Such a breakdown would reveal

whether the remediation actually improved achievement in

some areas while decreasing achievement in others. Also

the effect on the total test score might be so small that

it is masked by other sources of variance. A test given

just before remediation revealed that about half the
students diagnosed to be deficient in a particular mathematical

ability could perform physics tasks which require that

ability. Thus, assuming the validity of this test, many

students apparently were misdiagnosed. With 42 separate

abilities being tested, it seems apparent that the decision

to enter remediation must have been based on a small number

of questions. Frequently, a particular ability occurred

in only one small segment of the course content; it was

therefore impossible to check performance in that ability

throughout the course. In the present study, we hope to

avoid these two problems by defining the kinds of thinking

in a manner which will yield a small number of more general

abilities which occur throughout the topics of the course.

Larkin24 has reported the effect of remediation in a

college physics course. The remedial instruction involved

how to apply relationships in physics. In the first part

of the course, a randomly selected group of students
received instruction in how to identify relationships, the

important characteristics of a relationship, and how to

demonstrate knowledge of a relationship. Her study showed

that students who received this instruction were better

able to acquire an understanding of the relationships

which they encountered throughout the course.

As discussed earlier, Gagne10,11 has demonstrated the

effect of remediation when that remediation has been linked

to a task analysis of the desired learning.

This project also attempts to apply a task analysis,

but the application is made on the objectives of an entire

course rather than on one specific task. For this reason,

the subtasks identified are stated in general terms accord-

ing to the criteria mentioned earlier. We hope that pro-

viding students with some general intellectual skills will

improve their performance on tasks which require those

skills.

EXPERIMENTAL METHODS AND PROCEDURES

The Instructional Setting

Before describing the design and procedures which were

used to answer the research questions raised in the first

chapter, it is necessary to outline the format of the

courses to which the project was applied.

The Chemistry Department at Michigan State University

has transformed the first two courses of one of its
introductory chemistry sequences from the traditional lecture-

recitation format to a modular self-instructional mode.25

Most students who take these two courses (CEM 130 and CEM

131) are not chemistry majors but are required by their

major area to take an introductory chemistry sequence.

The primary instruction in these courses is contained

in a series of audio cassettes and accompanying
workbooks. The workbooks contain diagrams and examples to which

the students refer as they listen to the tape. Although

the students' pace through the course is somewhat flexible,

they are required to finish specified amounts of material

within two week periods. The course can be described as

a modified mastery approach26-28 since students are permitted

(within a designated time frame) to take examinations as

often as once a day until they are satisfied with the grade

that they have earned. A criterion referenced grading

scale is used so that students at any point during the course

can predict their course grade from performance on course

examinations. Thus, a student could make a contract (with

himself) for a particular course grade and repeat each exam-

ination until it is passed at a level which would translate

into the grade desired.

The alternate forms of examinations are generated by

computer from a bank of over 4000 questions. Each question

has a library number which indicates the associated unit

of the course and the concept or idea which it is testing.

Each fifteen item test contains a specified number of ques-

tions from each of the units covered by that exam.
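The assembly of an exam form from the question bank can be illustrated with a short Python sketch. It is not the program used in the course; the unit names, library numbers, blueprint, and function name are hypothetical, and random sampling without replacement within each unit is an assumed selection rule.

```python
import random

def build_exam_form(bank, blueprint, seed=None):
    """Draw the specified number of questions from each unit.

    bank: maps a unit name to the library numbers available for it.
    blueprint: maps a unit name to the number of questions required.
    Both structures are hypothetical stand-ins for the real question bank.
    """
    rng = random.Random(seed)
    form = []
    for unit, count in blueprint.items():
        # Sampling without replacement is an assumption, not the
        # documented behaviour of the course software.
        form.extend(rng.sample(bank[unit], count))
    return form

# Invented bank: three units with ten questions each.
bank = {f"unit{u}": [f"{u}{q:03d}" for q in range(10)] for u in (1, 2, 3)}
blueprint = {"unit1": 5, "unit2": 5, "unit3": 5}
form = build_exam_form(bank, blueprint, seed=1)
print(len(form))
```

Fixing the seed makes a given form reproducible, which would be convenient if a form had to be regenerated for regrading.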

Supplementary instruction is provided by graduate (or

occasionally undergraduate) student instructors who staff

a "Help Room" which is open to students during daytime and

evening hours. While this system allows students to get

their questions answered at any time, it does not foster

the kind of student-instructor contact which is necessary

to identify and remedy any systematic problems that students

may have. Thus, one of the goals of this project was to

produce a system of diagnosis and remediation which would

effectively deal with student learning problems.

The Computer Management System

An important part of this remedial system is the com-

puter management system which identifies students who are

scoring poorly on a particular class of questions. This

section will provide a detailed description of the technology

that has been developed.

A subtest score is defined as the percent correct res-

ponses to a specified group of questions which are all members

of the same class of questions. Each fifteen item examina-

tion (an exam form) in Chemistry 130 and 131 has associated

with it an exam composition index (ECI) which is a list of

master file library numbers of questions chosen for that

exam form. In order to create a subtest score, the classi-

fication of each of the questions is first added to the ECI

and stored on a disk file. The EDITOR subroutine (Appendix
A) is then used to pick out all of the questions of a given
class. For example, to select the questions requiring
the memorization of a property, one would scan the appropriate
columns of the file for the MP designation and create a new
file with only these questions in it. Additional file
modification can be done by deleting questions having undesirable

characteristics. In this study, we used those questions which

required only memorization for the memorization class and

deleted all questions with additional or alternative classi-

fications. A file of questions for the reasoning with math

group was created in a similar fashion.
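The selection step can be sketched in Python as follows. This is an illustration only, not the EDITOR subroutine listed in Appendix A; the record layout, the library numbers, and the function name are hypothetical.

```python
# Hypothetical ECI entries: (library number, set of classification codes).
# The library numbers and codes are invented for illustration only.
ECI = [
    ("0101", {"Mp"}),
    ("0102", {"Mp", "R1"}),
    ("0203", {"R1", "M1"}),
    ("0204", {"Mr"}),
]

def select_pure_class(eci, wanted):
    """Return library numbers whose classifications all fall in `wanted`.

    Questions carrying any additional or alternative classification
    are dropped, mirroring the deletion step described in the text.
    """
    return [lib for lib, codes in eci if codes and codes <= wanted]

memorization_only = select_pure_class(ECI, {"Mp", "Mr"})
print(memorization_only)
```

Question 0102 is excluded because it also carries a reasoning designation, so only questions requiring memorization alone remain in the class.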

Students mark their answers to exam questions on machine

scorable answer sheets. Fill-in questions are graded by hand

and the graders mark the appropriate boxes for correct and

incorrect answers. A special information sheet which is

prepared for each exam form, indicates the exam form num-

ber, the course number, and the time and date the exam was

administered. In the scoring process, the information is

transferred to magnetic tape and eventually to a disk

file. Each record on the disk file contains the student

number, the exam form number, the number of questions out

of fifteen which the student has answered correctly, and

the student's performance on each question. If the student

answered a question correctly, a "1" is recorded in the

column representing that question. If the question was

answered incorrectly, the letter corresponding to the

distractor chosen is recorded. Finally, a subtest score

is calculated by determining the percentage of correct

responses for questions of a particular class. The FORTRAN

programs which have been written to perform this analysis

are listed in Appendix A. The ANNOVA and FACTOR programs
were written for use with the Statistical Package for the
Social Sciences (SPSS)27 subroutines and are also listed
in Appendix A.
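The subtest calculation described above can be sketched in Python. This is not the FORTRAN program of Appendix A; the record string, the zero-based positions, and the function name are assumptions chosen for illustration.

```python
def subtest_score(responses, class_positions):
    """Percent correct over the question positions belonging to one class.

    responses: one character per question; "1" marks a correct answer and
        any other character is the letter of the distractor chosen.
    class_positions: zero-based positions of the questions in the class.
    """
    if not class_positions:
        raise ValueError("the class contains no questions on this form")
    correct = sum(1 for pos in class_positions if responses[pos] == "1")
    return 100.0 * correct / len(class_positions)

# Hypothetical fifteen-item record and class membership.
record = "11B1A111C11D111"
memorization_positions = [0, 2, 3, 7, 8]
print(subtest_score(record, memorization_positions))
```

In this invented record, three of the five memorization questions were answered correctly, giving a subtest score of 60 percent.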

The Classification Scheme

The diagnostic system which has been developed is based

upon a classification of examination questions in terms of

the kind of thinking required to answer them. The first

part of this section will describe the process by which

the categories were established. The characteristics

of each of the categories will then be described, and

finally the method of creating classes of questions based

on the classification scheme will be discussed.

Developing the Categories: A Task Analysis

In their book on Learning System Design, Alexander,

Yelon, and Davis29 describe a task analysis as a detailed

description of how a particular task is to be performed.

It takes into account the entry skills of the learner,

the type of learning involved, and the particular condi-

tions or constraints in the instructional environment which

influence the learning process. The process of develop-

ing categories for the present classification scheme began

with this type of task analysis. From each of the questions

used in Chemistry 130-131 a generalized description of the

task to be performed was developed. In each case the task

was described without reference to any of the chemistry

content involved. As the analysis proceeded, it became

apparent that a small number of these generalized task

descriptions were emerging as the most important types of

tasks required of students in these courses. Furthermore,

the general task would appear many times throughout the

various topics of each course. From each of these task

descriptions was developed a statement of a "kind of think-

ing" which is required of students taking introductory

chemistry. Six major kinds of thinking were identified,

some having several subcategories.

Once these kinds of thinking were identified, each ques-

tion was characterized according to whether it does or does

not require a particular kind of thinking. The criteria

for this characterization and a description of these kinds

of thinking are given in the next sections.

Kinds of Thinking

The various kinds of thinking required by questions

typically asked in a general chemistry course are listed

below.

I. M-Recalling Memorized Information

1. Mp - recalling memorized properties

2. Mr - recalling memorized relationships

II. C-Classification, Discrimination, Pattern Recognition.

Identifying an entity as a member or a nonmember

of a class without being given the criteria of

that identification.

III. V-Visualization

Forming an image of an object or set of objects

which are static, do not require the recognition

of color and have been seen by the student. The

following designations are added to the V in the

situations indicated.

d - if the object(s) is dynamic

c - if the color of the object(s) is required

n - if the object has not been seen but has been

described.

IV. T-Translation

1. Twm - between words and math

2. Tcw - between words and chemical symbols

3. Tcm - between math and chemical symbols

V. R-Reasoning

The sequencing or combining of ideas to derive

or evaluate new ideas.

Rs - The sequencing or combining of any of the

processes which have been listed in this

classification.

R1 - A one step reasoning process in which a single

relationship is applied to known properties

to determine an unknown property.

R2 - A more than one step reasoning process in

which a number of relationships are applied

in sequence.

R3 - A process which involves a series of reasoning

steps which have been described to students

in an algorithm.

VI. Mt-Math

Me - Working with numbers expressed in exponential

notation or as logarithms.

M1 - Manipulating linear algebraic equations.

M2 - Manipulating nonlinear equations.

A more detailed explanation of the criteria by which one

can identify the kinds of thinking required by a question

is given in the following section.

Criteria and Procedures for

Characterizing Questions

The following is a description of the criteria which

are used to decide whether or not a question requires a

particular kind of thinking.

I. Memorization: This category includes questions

which require students to recall memorized in-

formation. This information can be classified

into two types.

1. Properties: A property is one specific

characteristic of an entity. This character-

istic is specified through variables (mass,

height, color) and their corresponding values

(3 grams, 3 feet, red).

Sample question: "What is the precision of

an analytical balance?"

The property of the balance which is to be

specified is its precision. The value required

is 0.0001 gram.

2. Relationships: Any statement which defines
the dependence of one property on other
properties is called a relationship.

Sample question: "When a block of solid is
dropped into an insulated
beaker of liquid, the heat
lost by the substance originally
at the higher temperature is:"

Answer: equal to the heat gained by
the substance at the lower
temperature.

This question requires the student to recall

the relationship between the heat lost by an

object and the heat gained by its surround-

ings. This interpretation of the question

assumes that the entity being considered is the

block of solid and the important properties

of that entity are heat lost and heat gained.

An alternative interpretation assumes the

entity under discussion is the heat lost by

that object and a property of the heat lost

is that it is equal to the heat gained. In

this and similar cases the interpretation

which leads to the classification of the

information as a relationship will take

precedence.

II. Classification, Discrimination, and Pattern Recognition

The process of evaluating information for the
purpose of grouping.

The question must require the student to identify
an entity as a member or nonmember of a class
without being given, in the question, the criteria
of class membership.

Sample question: "Which of the following series
of elements contains only nonmetals?"

Most students in introductory chemistry would use
the position of the element on the periodic chart
to determine whether each element was a member
of the class of nonmetals. At an advanced stage,
one is able to classify by recognizing patterns
of stimuli. For instance, one can usually tell
that an object is a chair without examining its
properties one by one. Classification also requires
the consideration of the properties associated
with a particular class and will usually
require an Mp process. The Mp designation will
be assumed to be a part of the C designation,
and therefore is not listed with it.

III. Visualization

Questions will be characterized as involving

visualization if they require the student to form

an image of a static object which has been seen

and for which the recognition of color is not

important.

If the student must move the object in his mind,

a d will be added to indicate a dynamic object.

If the question requires the student to remember

color, a c will be added.

If the question requires the student to form an

image of an object which has been described but

not seen by the student, an n will be added.

Sample question: 1. If in a particular course

students are shown samples of one

mole of various substances, the

question "At room temperature

the volume occupied by one mole

of water is about . . .", would
be classified as V since the
sample was static and its color
was unimportant.

2. "A cube has how many four-fold
axes of symmetry?" This
question would be classified
as involving Vd since it requires
the student to rotate
the object in his mind.

IV. Translation

Questions which require the students to interpret

from one language to another will be characterized

as involving translation. There are three possible

subcategories.

1. Twm - translation between words and math

For example: "Given a = b + c, produce a

is equal to b plus c."

2. Tcw - translation between chemical symbols and

words

For example: "Given K + O2 = KO2, write one

mole of potassium reacts with one mole of

... etc."

3. Tcm - translation between chemical symbols

and math

For example: "Given N2O3 state the ratio

of oxygen atoms to nitrogen atoms in the

molecule."

V. Reasoning

The combining or sequencing of ideas to derive

or evaluate new ideas.

1. Rs - The combining or sequencing of the kinds

of thinking which have been characterized in

this classification scheme.

Sample question: 1. "A tetrahedron has how
many three-fold axes?" The
complete characterization of
this question would be Rs
(V,Mp,Vd). The letters in
parentheses indicate the
kinds of thinking being sequenced.
In this question
the student must first
visualize a tetrahedron, then recall
the definition of a three-fold
axis, and finally rotate
the image to determine how many
three-fold axes are present.

2. R1 - A one step reasoning process in which a
single relationship is applied to known properties
to determine an unknown property.

Sample question: 1. "If the density of a 4
cubic centimeter block of
metal is 4 g/cc, what is
its mass?" This question requires
the use of the relationship
between density,
mass, and volume to determine
the mass of an object, given
its density and volume.

2. "Two metal samples each
contain exactly the same
chemical elements. Emission
spectra show lines at exactly
the same wavelengths. One
can conclude:

Answer: Each of the two samples contain
exactly the same chemical
elements." This question

requires the use of the fact

that each element has a

unique emission spectrum to

determine that two samples

which produce exactly the

same emission spectra must

be composed of exactly the

same elements.

3. R2 - A multistep process which requires the application
of two or more relationships without
simply applying a learned algorithm.

Sample question: 1. "What is the density of a
4 g cube of metal with 3 centimeter
edges?" The question
requires the combination of
two relationships: d = m/v
and v = e^3.

2. "If an atom and an ion
contain the same number of
electrons, then they:"

Answer: must be of different elements.
The question involves realizing
that an atom and an ion

have a different charge and

that the charge is equal to

the number of protons minus

the number of electrons.

Therefore the number of

protons must be different.

Since nuclei with different

numbers of protons are of

different elements, the atom

and ion in question must be

of different elements. In

this sequence, a number of

principles have been applied

to the given information to

determine which of the given

statements is correct.

4. R3 - A process which involves a series of reasoning
steps which have been described to students

in an algorithm.

Specific tasks are common and complex enough

that they are often taught through the presenta-

tion of an algorithm or step by step procedure.

For example, one usually outlines the procedure

for finding the percent composition from a

molecular formula. Students are expected to

solve these problems by applying the proper

procedure to them.

Obviously, any question to which an algorithm

has been applied could be answered in the ab-

sence of the algorithm by combining the necessary

ideas. Thus an appropriate alternative classi-

fication for these types of question is Rs.

It is also important to note the characteristics

of the steps in the algorithm. This will be

done by including in parentheses after the R3
designation the symbols for kinds of thinking
involved.

VI. Math

The mathematics which is required is divided into
three categories.

1. Me - scientific notation, exponents, and
logarithms

This category will include any question which
requires the manipulation of exponents or
logarithms, or numbers written in scientific
notation.

2. M1 - algebra with linear equations

All questions requiring the manipulation of

linear algebraic equations will be classified

as M1.

3. M2 - algebra with quadratic or higher power

equations.

Sample question: "What is the wavelength of
an electromagnetic wave having a frequency of
10^4 cycles/sec?" This question requires the
solving of the equation relating wavelength to
frequency (M1) and also the manipulation of
10^4 and 3 x 10^10 (Me).

Using this set of criteria, one can characterize examination
questions according to the kinds of thinking involved.

In this project, the characterization was performed by

asking, "What method would be used by the majority of stu-

dents in CEM 130 and CEM 131 to answer this question?"

There are two important considerations. First of all, an

estimation of student performance was used rather than an

analysis of how a trained chemist might solve a particular

problem. Second, the instructional setting was carefully

considered. It was felt that the methods students use to

work problems and answer questions will depend upon the

information students bring to the problem. This obviously

will be a function of the instruction which the students

have received.

The characterization of the questions asked in CEM 130-

131 would often produce important combinations or sequences

of kinds of thinking which could then be considered as a

unique class of questions. For example, a very common
sequence is R1 with M1 (a one step application of a mathematical
relationship). This combination appeared often enough that
it was given a special designation (R1(M1)) and was considered

as a distinct kind of thinking. There were also questions for

which there were two alternate methods of solution commonly

used by students. For these questions, a slash (/) was used

to indicate that a question required one kind of thinking or

another kind of thinking depending upon the method a student

chose to work the problem. For example, the designation

Mr/R1 would be used for questions which some students answer

by recalling a memorized relationship and other students

answer by applying a relationship in a one step reasoning

process. To help illustrate the characterization procedure,

some sample questions and their classification are given in

Appendix B.
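The designation notation lends itself to a simple mechanical reading, sketched here in Python. The parser is an illustration and was not part of the project; the function name is hypothetical, and it assumes that a slash never occurs inside parentheses and that parentheses are not nested.

```python
import re

def parse_designation(text):
    """Split a designation into alternatives, each a (kind, components) pair.

    "Mr/R1" yields two alternatives; "Rs(V,Mp,Vd)" yields one alternative
    whose components are the kinds of thinking being sequenced.
    """
    alternatives = []
    for part in text.replace(" ", "").split("/"):
        match = re.fullmatch(r"([A-Za-z0-9]+)(?:\(([^()]*)\))?", part)
        if match is None:
            raise ValueError(f"unrecognized designation: {part!r}")
        kind, inner = match.groups()
        components = inner.split(",") if inner else []
        alternatives.append((kind, components))
    return alternatives

print(parse_designation("Rs(V,Mp,Vd)"))
print(parse_designation("Mr/R1"))
```

A question designated Mr/R1 thus carries two alternative kinds of thinking, while R1(M1) carries a single kind with one component.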

Validation of the Categories

As discussed earlier, the process of characterizing ques-

tions produces a class of questions which is used as an

instrument to measure a student's ability to perform a

particular kind of thinking. Whenever a new instrument is

developed, it should be accompanied by evidence concerning

its ability to measure performance accurately and precisely.

The usual procedure is to supply data concerning the

reliability and validity of the test.

Reliability

One measure of the reliability of a test is the correla-

tion between scores on two tests which attempt to measure

the same thing. Often the two tests are obtained by ar-

bitrarily dividing a test into two parallel forms of equal

length and measuring scores on each half of the test. A

very popular measure of reliability is the Kuder-Richardson

Formula 21 (K.R.21)30 which essentially creates all possible

parallel forms of an examination and averages the correla-

tions between them.

Since the examinations used in CEM 130-131 are very

carefully designed to include an even distribution of
questions from a rather wide range of topics, the correlation

between arbitrary split halves of the test would not be

expected to be very high. The correlation between scores

on two equivalent forms of an examination was therefore used

as a measure of the reproducibility of the test scores and

hence the reliability of the questions.

In a previous study of CEM 130-131,31 thirty item examina-

tions were prepared by combining two alternate forms of the

usual fifteen item exams. The items were mixed thoroughly

to avoid fatigue or time limit factors. Students were told

that the score on each form would be computed individually

and they would receive the higher of the two scores. The

Pearson product-moment correlation coefficient between

these two scores was calculated as follows.

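$$ r = \frac{\sum_{i=1}^{N} z_{x_i} z_{y_i}}{N} \qquad (1) $$

where

$$ z_{x_i} = \frac{x_i - \bar{x}}{\sigma_x} $$

and

$$ z_{y_i} = \frac{y_i - \bar{y}}{\sigma_y} $$

x and y are the scores obtained on alternate forms.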

When this analysis was performed on approximately thirty

sets of examinations, the average correlation coefficient

obtained was 0.69.
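Equation 1 can be illustrated with a short Python sketch. It assumes that the standard deviations are population values (computed with N rather than N - 1 in the denominator), which is what makes the average product of z-scores equal to the Pearson coefficient; the function name and the sample scores are hypothetical.

```python
from math import sqrt

def pearson_r(x, y):
    """Pearson correlation as the mean product of paired z-scores."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must be paired and contain at least two scores")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Population standard deviations (divide by N), an assumption of this sketch.
    sd_x = sqrt(sum((v - mean_x) ** 2 for v in x) / n)
    sd_y = sqrt(sum((v - mean_y) ** 2 for v in y) / n)
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined when one set of scores is constant")
    return sum(((a - mean_x) / sd_x) * ((b - mean_y) / sd_y)
               for a, b in zip(x, y)) / n

# Invented scores on two alternate fifteen-item forms.
form_1 = [9, 12, 7, 14, 10]
form_2 = [10, 11, 8, 13, 9]
print(pearson_r(form_1, form_2))
```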

In order to obtain some measure of the reliability inde-

pendent of test length, the Spearman Brown formula (Equation

2) was used to estimate the reliability of these tests if

they were composed of more items.

$$ r_n = \frac{n r_s}{(n - 1) r_s + 1} \qquad (2) $$

This relation is used to calculate the reliability (rn)

of a test with n times as many items as a shorter test of

known reliability (rs).

Many nationally used tests of educational achievement

containing over 100 items report reliabilities between

0.90 and 0.95. If the tests used in CEM 130-131 were lengthen-

ed to 75 items, the calculated reliability would be 0.92

which compares favorably with standard educational achieve-

ment tests.
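The arithmetic behind this estimate can be checked with a minimal Python sketch of Equation 2. The function name is hypothetical; the inputs are the average correlation of 0.69 reported above for a fifteen item form and the lengthening factor n = 75/15 = 5.

```python
def spearman_brown(r_short, n):
    """Estimated reliability of a test n times as long as the original."""
    return n * r_short / ((n - 1) * r_short + 1)

# Fifteen-item reliability of 0.69, lengthened to 75 items (n = 5).
estimate = spearman_brown(0.69, 75 / 15)
print(round(estimate, 2))  # 0.92
```

The result, 3.45/3.76, rounds to the value of 0.92 cited in the text.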

Agreement Among Classifiers

Another important measure of the validity of the
classification scheme is the agreement obtained when a number of

individuals who are familiar with the content of the course

attempt to classify questions. In one test of this inter-

classifier agreement, three chemistry faculty and the author

classified a group of questions independently. In another

test, an undergraduate teaching assistant and the author

compared their classification of questions. In both cases,

the percentage agreement was calculated as the number of

classifiers who agreed on the classification of a question

divided by the number of classifiers and then multiplied by

100%. For example, if three of four individuals classify

a question as memorization and the other as reasoning,

a tally of 3 out of 4 would be assigned to the memori-

zation class and no tally would be made for the reasoning

class. If two of four classified the question as reason-

ing and the other two as memorization, a tally of 2 out

of 4 would be added to each group. If for a different

question, two classifiers identified the question as reasoning
and math and two identified it as memorization and

math, the math category would receive a tally of 4 out of

4 and the other two categories 2 out of 4. The results

of this analysis are shown in Table I.
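The tallying rule illustrated by these three examples can be sketched in Python. The text does not state the rule in general form, so the sketch assumes that a category receives a tally for a question only when at least two classifiers named it; the function name and the data layout are hypothetical.

```python
from collections import Counter

def tally_agreement(questions, min_agree=2):
    """Accumulate agreement tallies per kind of thinking.

    questions: one entry per question, each a list holding one set of
        categories per classifier.
    Returns {category: (identical, possible, percentage)}.
    The min_agree threshold is an assumption inferred from the examples.
    """
    identical = Counter()
    possible = Counter()
    for classifications in questions:
        votes = Counter(cat for cats in classifications for cat in set(cats))
        for cat, count in votes.items():
            if count >= min_agree:
                identical[cat] += count
                possible[cat] += len(classifications)
    return {cat: (identical[cat], possible[cat],
                  100.0 * identical[cat] / possible[cat])
            for cat in possible}

# The three examples given in the text, four classifiers each.
examples = [
    [{"memorization"}, {"memorization"}, {"memorization"}, {"reasoning"}],
    [{"reasoning"}, {"reasoning"}, {"memorization"}, {"memorization"}],
    [{"reasoning", "math"}, {"reasoning", "math"},
     {"memorization", "math"}, {"memorization", "math"}],
]
print(tally_agreement(examples))
```

For these three questions the memorization class accumulates 7 of 12 possible, reasoning 4 of 8, and math 4 of 4.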

An Analysis of Correlation Coefficients

Campbell and Fiske32 have described a set of procedures

for the validation of tests of individual differences. The

validation is based upon an analysis of the correlation

between scores on tests which are supposed to measure the

same trait, compared to the correlations between scores

on tests which measure different traits. A "multitrait-

multimethod" matrix is created which groups correlations

according to the trait being measured and the method used

to measure that trait. If the tests were valid, one would

expect that the correlations between tests measuring the

same trait would be greater than the correlations between

tests measuring different traits.

In the present study, an item which is characterized

as requiring a particular kind of thinking can be thought

of as a one item test of the student's ability to perform

that kind of thinking. One would expect that the correla-

tion between items of the same kind of thinking would be

greater than that for items requiring different kinds of

thinking. Obviously, there are many other factors controlling
student performance on a particular examination question.

Table I. Interclassifier Agreement

                  Number of
Kind of           Identical          Total
Thinking          Classifications    Possible    Percentage

Memorization      176                190         92.6%

Reasoning          78                 82         95.1%

Translation        50                 66         75.8%

Classification     30                 40         75.0%

Visualization      44                 58         75.9%

Math               42                 54         77.8%

One of the most important of these is the topic from which

the item was chosen. In this study, the Pearson
product-moment correlations from selected fifteen item tests were

separated into four groups as shown below.

Group A: Coefficients between items from a given

class of question which are related to the same

topic.

Group B: Coefficients between items from different

classes taken from the same topic.

Group C: Coefficients between items from the same

class but from different topics.

Group D: Coefficients between items from different

classes and from different topics.
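The grouping above can be sketched in Python. This is a simplified illustration and not the procedure of Appendix C: it treats every pair of items on a test as belonging to exactly one group according to whether the two items share a class and a topic, and the field names, item data, and correlation values are invented.

```python
from itertools import combinations

def group_label(item_a, item_b):
    """Assign a pair of items to Group A, B, C, or D."""
    same_class = item_a["cls"] == item_b["cls"]
    same_topic = item_a["topic"] == item_b["topic"]
    if same_topic:
        return "A" if same_class else "B"
    return "C" if same_class else "D"

def average_by_group(items, corr):
    """Average the inter-item correlations within each group.

    items: list of dicts with hypothetical keys "id", "cls", "topic".
    corr: maps frozenset({id1, id2}) to the correlation for that pair.
    """
    sums, counts = {}, {}
    for a, b in combinations(items, 2):
        label = group_label(a, b)
        sums[label] = sums.get(label, 0.0) + corr[frozenset((a["id"], b["id"]))]
        counts[label] = counts.get(label, 0) + 1
    return {label: sums[label] / counts[label] for label in sums}

# Invented four-item test and correlations, for illustration only.
items = [
    {"id": 1, "cls": "M", "topic": "stoichiometry"},
    {"id": 2, "cls": "M", "topic": "stoichiometry"},
    {"id": 3, "cls": "R", "topic": "stoichiometry"},
    {"id": 4, "cls": "M", "topic": "gas laws"},
]
corr = {
    frozenset((1, 2)): 0.20, frozenset((1, 3)): 0.10,
    frozenset((2, 3)): 0.14, frozenset((1, 4)): 0.12,
    frozenset((2, 4)): 0.08, frozenset((3, 4)): 0.05,
}
print(average_by_group(items, corr))
```

Each test then contributes one average per group, and these averages serve as the observations in the analysis of variance.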

The individual fifteen item tests were selected for this

analysis if they produced an approximately equal number of

Pearson product moment correlation coefficients in each of

the groups listed above. For every test, an average of cor-

relation coefficients for each of the four groups was ob-

tained. Thus, every fifteen item test yielded four values,

each being an average of correlation coefficients from

Groups A through D. A description of the procedure used

in calculating the coefficients, and a list of topics

used are given in Appendix C. These values were then used

as the data for an analysis of variance. For each of the

six major classes of questions, the following hypotheses

were tested at the α = .05 level.


Hypothesis 1. The average of correlation coefficients

for group A will be higher than the average for group

B.

Hypothesis 2. The average of correlation coefficients for

group C will be greater than the average for group D.

Hypothesis 3. The average correlation between items from

the same topic will be higher than the average correlation

between items from different topics.

Hypothesis 4. The average correlation between items of

the same class will be greater than the average cor-

relation between items from different classes.

Hypotheses 1 and 2 were tested with a planned comparison

analysis of variance contrasting group A vs. group B and

group C vs. group D using the average correlation co-

efficients as the input data. The analysis was performed

on each of the six kinds of thinking except the category

classification, since there were not enough questions in

this category to produce any useful information concerning

its validity.

The averages, standard deviations, and the number of

tests used for the four groups of correlation coefficients

for each kind of thinking tested are shown in Table II.

The results of the planned comparison analysis are

shown in Table III. Hypothesis 1 was supported for the

reasoning with math category and was not supported for the

other four categories. Hypothesis 2 was supported for the

reasoning with math and the memorization categories but was

not supported for the other three categories.

Table II. Averages and Standard Deviations for Grouped Sets of
Correlation Coefficients.

                      Number                        Standard
Kind of Thinking      of Tests   Group   Average    Deviation

Memorization             6         A      .1213       .0255
                                   B      .0970       .0268
                                   C      .0970       .0266
                                   D      .0648       .0149

Visualization            6         A      .1217       .0569
                                   B      .1158       .0503
                                   C      .1163       .0479
                                   D      .0763       .0420

Translation             10         A      .1411       .0281
                                   B      .1472       .0245
                                   C      .1234       .0256
                                   D      .1063       .0263

Reasoning                6         A      .1358       .0281
                                   B      .1151       .0339
                                   C      .1007       .0272
                                   D      .0824       .0254

Reasoning with           9         A      .1729       .0366
Math                               B      .1040       .0432
                                   C      .1206       .0241
                                   D      .0586       .0160

Table III. Planned Comparisons of Grouped Correlation Coefficients.

Kind of Thinking       Contrast    T Value    P Less Than

Memorization           A vs B       2.03        0.084
                       C vs D       2.69        0.012

Visualization          A vs B        .206       0.84
                       C vs D       1.40        0.15

Translation            A vs B       -.521       0.61
                       C vs D       1.46        0.15

Reasoning              A vs B       1.24        0.23
                       C vs D       1.10        0.29

Reasoning with Math    A vs B       4.60         .001
                       C vs D       4.14         .001

To test hypotheses 3 and 4, a two-way analysis of variance

was used. This analysis is illustrated in Figure 1.

                         Topic

                    Same        Not Same

Class   Same        Group A     Group C

        Not Same    Group B     Group D

Figure 1. A Two-way Analysis of Variance.

By combining groups A and B into one group, and groups

C and D into another, one can compare directly the correlations

between items from the same topic and items from different

topics. The results of this analysis are then used

as a test of hypothesis 3. Similarly by combining groups A

and C into one group and groups B and D into another, one

can compare correlations between items from the same class

to correlations between items of different classes and

thus perform a test of hypothesis 4. The results of this

analysis are shown in Table IV.

The values listed in the column labelled "Matched"

are the means of correlation coefficients between items of

either the same topic or the same class. In the "Not Matched"

column are means of correlation coefficients between items

from different topics or classes.
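Because every test contributes one value to each of the four groups, the "Matched" and "Not Matched" means can be recovered by averaging pairs of the group averages in Table II. The sketch below does this; the group averages are transcribed from Table II, and the names TABLE_II and marginal_means are introduced here only for illustration. For Memorization it gives .109 and .081 for both main effects; a few of the other entries differ from the tabulated means in the third decimal place, which may reflect rounding or transcription.

```python
# Group averages transcribed from Table II.
TABLE_II = {
    "Memorization": {"A": 0.1213, "B": 0.0970, "C": 0.0970, "D": 0.0648},
    "Visualization": {"A": 0.1217, "B": 0.1158, "C": 0.1163, "D": 0.0763},
    "Translation": {"A": 0.1411, "B": 0.1472, "C": 0.1234, "D": 0.1063},
    "Reasoning": {"A": 0.1358, "B": 0.1151, "C": 0.1007, "D": 0.0824},
    "Reasoning with Math": {"A": 0.1729, "B": 0.1040, "C": 0.1206, "D": 0.0586},
}


def marginal_means(g):
    """Collapse four group averages into matched / not-matched means.

    Topic effect: groups A and B (same topic) vs. C and D (different topics).
    Class effect: groups A and C (same class) vs. B and D (different classes).
    """
    return {
        "Topic": {"matched": (g["A"] + g["B"]) / 2,
                  "not_matched": (g["C"] + g["D"]) / 2},
        "Class": {"matched": (g["A"] + g["C"]) / 2,
                  "not_matched": (g["B"] + g["D"]) / 2},
    }


for kind, groups in TABLE_II.items():
    for effect, m in marginal_means(groups).items():
        print(f"{kind:<20} {effect:<6} "
              f"{m['matched']:.3f} {m['not_matched']:.3f}")
```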

Table IV. Two-way Analysis of Variance of Grouped Correlation
Coefficients.

                                   Mean Correlation
                                     Coefficients
Kind of               Main                     Not                  P
Thinking              Effects      Matched     Matched      F       Less Than

Memorization          Topic         .109        .081        8.32     .009*
                      Class         .109        .081        8.32     .009*

Visualization         Topic         .119        .096        1.23     .281
                      Class         .119        .096        1.28     .270

Translation           Topic         .144        .119       12.6      .008*
                      Class         .132        .127         .443    .999

Reasoning             Topic         .125        .092       12.5      .002*
                      Class         .118        .098        4.1      .049*

Reasoning with Math   Topic         .138        .090       16.5      .008*
                      Class         .147        .081       29.6      .001*

*Significant at α = .05.


As shown in this table, there is a class main effect

for the categories Memorization, Reasoning and Reasoning

with Math. There were no significant two-way interactions

found.

The Remedial System: Project CLIC

Selecting the Sample

During Spring Term 1976, an experimental remedial class

called CLIC (Comprehensive Learning in Chemistry) was pro-

vided for selected students in Chemistry 131. The students

were chosen on the basis of their performance in Chemistry

130 during the previous term. Students who scored below

60 percent on tests in CEM 130 were placed in the group cor-

responding to the class of questions for which they received

the lowest score. Students who failed Chemistry 130 were

not invited since they would not be taking Chemistry 131.

Of the 120 students invited, 30 indicated that they were

not going to enroll in Chemistry 131. Of the remaining 90,

70 participated in the project to some extent, 45 completed

all segments of the remedial class, and 54 missed no more than

one of the three classes and one of the three tapes. This

group of 54 students formed the sample for this study. In

order to keep the sample size as large as possible, there

were three bonus points (out of a possible 90) offered to

students who participated in the project. This created


an additional incentive which most likely would not be

available to students in the ongoing operation of the

remedial system.

The Design

Students in each group were placed into control and

experimental groups. The control group in each case was

given remediation corresponding to the kind of thinking

for which these students were least deficient. That is,

a control group student who was diagnosed more deficient

in memorization skills would be placed in the reasoning

with math class. The dependent variable used to measure

the effect of remediation was the score obtained by the

student on the class of questions for which he was diagnosed

as being most in need of remediation. For example, of the

students who scored lowest on questions requiring memoriza-

tion, half would be placed in the reasoning with math class

(the control group) and half would be placed in the memoriza-

tion class (the experimental group). In the subsequent

analysis, only their scores on memorization questions

would be examined. This design was chosen to control for

the effect of students receiving additional help and individual

attention. If the type of remediation is important,

one would expect the experimental group to perform better

on the remediated kind of thinking than the control group.
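The logic of this design can be summarized in a short sketch. It assumes, as in Project CLIC, that only two kinds of thinking are remediated; the function names, the score dictionary, and the arm labels are hypothetical. How students were divided between the two arms is not specified beyond "half", so the sketch takes the arm as an input.

```python
REMEDIATED_KINDS = ("memorization", "reasoning with math")


def diagnose(class_scores):
    """Return the remediated kind of thinking on which the student scored lowest.

    class_scores maps each kind of thinking to the proportion of questions
    of that kind answered correctly in the previous course.
    """
    return min(REMEDIATED_KINDS, key=lambda kind: class_scores[kind])


def remedial_class(diagnosed_kind, arm):
    """Return the class a student attends.

    Experimental students attend the class matching their diagnosis;
    control students attend the other class, which still provides extra
    help and attention but addresses a different kind of thinking.
    """
    if arm == "experimental":
        return diagnosed_kind
    if arm == "control":
        return next(k for k in REMEDIATED_KINDS if k != diagnosed_kind)
    raise ValueError(f"unknown arm: {arm!r}")


def analyzed_kind(diagnosed_kind):
    """The dependent variable always uses questions of the diagnosed kind."""
    return diagnosed_kind
```

Under this scheme both arms receive the same amount of additional instruction, so a difference on the diagnosed kind of question can be attributed to the type of remediation rather than to extra attention alone.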


The Treatment

In Chemistry 131, the students take five examinations

and a final. As described earlier, students may repeat

examinations within a specified time period until they

are satisfied with the grade they have obtained. The

final exam may not be repeated.

If, for example, a student is satisfied with the grade

he has obtained for exam 1, he will begin studying the topics

covered by exam 2. When he feels prepared, he takes his

first try of exam 2 and can then repeat exam 2 until he has

received a satisfactory grade, or until the deadline for

taking exam 2 has passed.

Participants in Project CLIC were asked to follow the

procedures outlined below in the order given.

1. To study the materials for each exam in their usual

fashion.

2. To take one attempt at an exam.

3. To listen to a CLIC Tape.

4. To attend a CLIC Class.

5. To retake the exam until satisfied with the grade

obtained.

This set of procedures was to be followed for each of

the first three examinations. There were no CLIC tapes or

classes provided for exams four and five.

Each CLIC tape begins with a discussion of the methods

by which a student could improve his skill in performing a


particular kind of thinking. This is followed by a discussion

of how the kind of thinking can be applied to the chemistry

topic being discussed. A detailed outline of the material

presented on each tape is given in Appendix D.

The CLIC classes followed a similar outline, but more

time was spent applying the kind of thinking skill to exam-

ples from the context of the course. Students were permitted

to ask questions and request that the instructor work problems

from the study guide or from previous tests. Whenever pos-

sible, the instructor would attempt to relate the answer to

a student's question to the kind of thinking being remediated.

Often questions were asked which related to the wrong kind

of thinking. That is, a student in the memorization class

would ask a reasoning with math type question. When this

occurred the instructor would simply work the problem with-

out relating it to a particular kind of thinking.

Evaluation of the Remedial System

In the evaluation of the effectiveness of the CLIC

Project, two distinct factors were considered. They can

best be described by restating two of the research ques-

tions posed in the first chapter.

Research Question b. Initial Learning

"Do the remedial materials which focus on the kind

of thinking required in a specific class of questions


covering a particular topic in chemistry improve stu-

dent performance on that class of questions for that

particular topic?"

Research Question c. Transfer of Training

"Once students receive remediation on a particular

kind of thinking as it applies to general chemistry in

one topic, will there be any improvement in their

performance on questions requiring the same kind of

thinking but covering a different topic? That is, is

there transfer of training of a particular kind of

thinking to new content in the course?"

The initial learning score, which is related to

research question b, is defined as the percent correct

responses to questions which require the kind of thinking

for which the student was diagnosed in need of remediation

and which were answered during a student's second and sub-

sequent tries of exams 1, 2, and 3. This dependent variable

measures the effect of remediation which focuses on the kind

of thinking required in a specific class of question

covering a particular topic in chemistry, on student per-

formance on that class of questions for that particular

topic. Therefore, a comparison of values of the initial

learning score for experimental and control groups will

provide the necessary data to answer research question b.

The specific hypotheses developed for research question

b are as follows:


Hypothesis b1. For students diagnosed in need of

memorization remediation, the experimental group will

score higher than the control group on the initial

learning score.

Hypothesis b2. For students diagnosed in need of

reasoning with math remediation, the experimental

group will score higher than the control group on

the initial learning score.

To answer research question c, we define the transfer

score as the percent correct responses to questions which

require the kind of thinking in which the student is de-

ficient, and which were answered during the student's first

try of exams 2 and 3, and all tries of exam 4 and 5, and

the final. This dependent variable is a measure of a

student's performance on a particular class of questions

when new chemistry content is introduced. Therefore, an

examination of values of the transfer score for experimental

and control groups will provide the necessary data to answer

research question c.
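The two dependent variables differ only in which responses they include. The sketch below expresses both definitions as filters over a student's response records. The record layout (keys "exam", "attempt", "kinds", "correct") and the function names are assumptions made for illustration, and the scores are returned as proportions, as they are reported in Table V.

```python
def _proportion_correct(records):
    """Proportion of records answered correctly, or None if there are none."""
    if not records:
        return None
    return sum(1 for r in records if r["correct"]) / len(records)


def initial_learning_score(records, diagnosed_kind):
    """Questions of the diagnosed kind answered on the second and
    subsequent tries of exams 1, 2 and 3."""
    picked = [
        r for r in records
        if diagnosed_kind in r["kinds"]
        and r["exam"] in (1, 2, 3)
        and r["attempt"] >= 2
    ]
    return _proportion_correct(picked)


def transfer_score(records, diagnosed_kind):
    """Questions of the diagnosed kind answered on the first try of
    exams 2 and 3 and on all tries of exams 4 and 5 and the final."""
    picked = [
        r for r in records
        if diagnosed_kind in r["kinds"]
        and (
            (r["exam"] in (2, 3) and r["attempt"] == 1)
            or r["exam"] in (4, 5, "final")
        )
    ]
    return _proportion_correct(picked)
```

Note that the first try of exam 1 enters neither score, since it precedes any remediation.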

The specific hypotheses developed from research question c

are as follows:

Hypothesis c1. For students diagnosed in need of

memorization remediation, the experimental group will

score higher than the control group on the transfer

score.


Hypothesis c2. For students diagnosed in need of

reasoning with math remediation, the experimental

group will score higher than the control group on the

transfer score.

Each of the hypotheses was tested using a simple t-

test with α = .05. The results of this analysis are shown

in Table V. For both the memorization and the reasoning

with math classes there was a significant difference be-

tween the experimental and control group for the initial

learning score but not for transfer score. Thus, the re-

medial classes appear to be effective at the time of remedia-

tion, but the learning does not appear to transfer to any

new content.


Table V. T-test Evaluation of Project CLIC

Memorization Class - Initial Learning Score

                       Mean Initial      Standard                 P
                  N    Learning Score    Deviation    T-Value     Less Than
Experimental     15        .741            .076         5.39        .001
Control          10        .477            .167

Memorization Class - Transfer Score

                       Mean              Standard                 P
                  N    Transfer Score    Deviation    T-Value     Less Than
Experimental     15        .350            .269          .04        .969
Control          10        .346            .203

Reasoning With Math Class - Initial Learning Score

                       Mean Initial      Standard                 P
                  N    Learning Score    Deviation    T-Value     Less Than
Experimental     14        .667            .091         4.52        .001
Control          15        .522            .082

Reasoning With Math Class - Transfer Score

                       Mean              Standard                 P
                  N    Transfer Score    Deviation    T-Value     Less Than
Experimental     14        .418            .124          .16        .872
Control          15        .426            .144
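The T-values in Table V can be checked from the tabulated means, standard deviations, and group sizes. The text does not state which form of the t-test was used; the sketch below assumes an independent-samples test with a pooled variance estimate, and under that assumption it gives values close to those in the table (about 5.38 and 4.51 for the two initial learning comparisons). The function name pooled_t is introduced here for illustration.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Two-sample t statistic with pooled variance, from summary statistics.

    Returns the t value and its degrees of freedom (n1 + n2 - 2).
    """
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_error = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_error, df


# Summary statistics transcribed from Table V (experimental, then control).
comparisons = {
    "Memorization, initial learning": (0.741, 0.076, 15, 0.477, 0.167, 10),
    "Memorization, transfer": (0.350, 0.269, 15, 0.346, 0.203, 10),
    "Reasoning with math, initial learning": (0.667, 0.091, 14, 0.522, 0.082, 15),
    "Reasoning with math, transfer": (0.418, 0.124, 14, 0.426, 0.144, 15),
}

for label, stats in comparisons.items():
    t, df = pooled_t(*stats)
    print(f"{label:<40} t = {t:6.2f}  (df = {df})")
```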

SUMMARY AND DISCUSSION

In this chapter results of the study will be summar-

ized and interpreted. The results of a test of each of the

hypotheses will also be presented.

Overview of the Project

This project has produced and tested a unique diag-

nostic remedial system which is based upon a classification

of the tasks which students are asked to perform in a

general chemistry course. The classification scheme is

based on the kind of thinking required by the test questions

used in the course. The scheme has been evaluated by first

examining the extent of agreement obtained when content

experts classify questions and second by calculating inter-

item correlation coefficients for questions grouped according

to the categories proposed.

Two classes of questions (reasoning with math and

memorization) were chosen as the basis of remediation.

Remedial materials and classes were made available to

students during the first half of Chemistry 131, Spring

Term, 1976. The effectiveness of this remediation was

measured by monitoring student performance on a specific

class of questions during the term. A distinction was

made between scores obtained from tests involving content

discussed in remediation and scores obtained when the

student was being introduced to new material. The latter


represents the effect of transfer of learning to think in

a particular way to new material in the course. This

transfer of training represents the ultimate goal of this

type of remediation.

The Validity of the Classification Scheme

Reliability

The reliability of the questions used in this study

compares favorably with the reliability reported for

standard achievement tests when adjusted for length. Since

the same bank of questions is used each term to create the

individual tests, questions that are misinterpreted by stu-

dents and questions which tend to mislead students have

been systematically removed from the file. This process

has created a set of questions which have withstood a great

deal of scrutiny by both students and faculty and are there-

fore considered to be good tests of student achievement.

It is important to remember that this test of reliability

refers to the measurement of overall achievement in the

course, and not to the measurement of a student's ability

to perform a particular kind of thinking. The latter would

be obtained by comparing scores on arbitrary halves of a

group of questions of the same kind of thinking. This type

of analysis was reported by Tannenbaum12 in his evaluation

of science processes. In the present study, the individual


fifteen item tests did not provide a large enough sample

of questions to permit this type of analysis to yield any

useful results.
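For reference, the kind of split-half analysis described above could be sketched as follows. This is an illustration only: the odd/even split, the function names, and the use of the Spearman-Brown formula as the adjustment for length are assumptions, since the text does not name the method used. The sketch also assumes that the half scores vary across students.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_reliability(score_matrix):
    """Split-half reliability for questions of one kind of thinking.

    score_matrix[s][q] is 1 if student s answered question q correctly,
    else 0. Questions are divided into two arbitrary halves (here the
    even- and odd-numbered positions). Returns the correlation between
    the half scores and the Spearman-Brown estimate for the full set.
    """
    first_half = [sum(row[0::2]) for row in score_matrix]
    second_half = [sum(row[1::2]) for row in score_matrix]
    r_half = pearson(first_half, second_half)
    r_full = 2 * r_half / (1 + r_half)
    return r_half, r_full
```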

Interclassifier Agreement

Table VI summarizes the results obtained when content

experts who are familiar with the specific nature of the

course attempt to classify questions according to the

proposed scheme. The agreement for the memorization and

reasoning categories, which is much higher than the other

categories, may well be due to the large number of questions

which require these kinds of thinking. The discrepancies

which did occur were usually caused by a classifier omitting

a kind of thinking because it was trivial compared to another

kind of thinking required by the question. This problem

also accounted for most of the discrepancies which occurred

for the translation, visualization and math categories.

There was considerable discussion among the chemistry

faculty concerning the distinction between reasoning and

memorization. Questions which required a very simple

application of a principle were considered by some to be

memorization and by others to be reasoning. For example,

a question which requires the determination of the density

of an object given its mass and volume, would be classified

according to the scheme as a reasoning question because it

requires the student to apply the relationship between

Table VI. A Summary of Interclassifier Agreement.

Kind of Thinking     Percentage Agreement

Memorization               92.6%
Reasoning                  95.1%
Translation                75.8%
Classification             75.0%
Visualization              75.9%
Math                       77.8%

density, mass and volume. It has been argued that this

question requires only the memorization of the relation-

ship and the reasoning is trivial. To resolve this problem,

one might measure the correlation between these kinds of

questions and questions which are definitely in the memory

category and compare the result with correlations between

these kinds of questions and questions which are definitely

in the reasoning category. That is, let an analysis of

student performance determine the category into which these

types of questions would be placed.
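A minimal sketch of this proposal follows. It was not carried out in the present study; the function names and the rule of assigning the question to whichever category gives the higher mean correlation (with ties going to reasoning) are assumptions made for illustration. Each score list is assumed to contain some variation.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson product-moment correlation of two equal-length lists."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def classify_by_performance(ambiguous, memory_items, reasoning_items):
    """Place an ambiguous question by its correlations with anchor questions.

    ambiguous is the list of 0/1 student scores on the question in doubt;
    memory_items and reasoning_items are lists of such score lists for
    questions that are definitely in each category.
    """
    r_memory = mean([pearson(ambiguous, item) for item in memory_items])
    r_reasoning = mean([pearson(ambiguous, item) for item in reasoning_items])
    label = "memorization" if r_memory > r_reasoning else "reasoning"
    return label, r_memory, r_reasoning
```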

The category called "classification" proved to be

extremely difficult to use because it is actually a sub-

set of the reasoning category. The differences between

these two categories are too subtle to yield reliable

classifications.

It has been suggested that the ease and reliability

of question classification might be improved by asking the

classifier to simply choose the one most important kind of

thinking in a particular question. The "most important"

kind of thinking would be defined as the kind of thinking

which would be most likely to cause the student to miss

the question. This kind of analysis would eliminate de-

ciding whether a kind of thinking was too trivial to include

in the characterization of the question but adds a decision

concerning the relative importance of more than one kind

of thinking when several are required by a question.

The data presented in Table II (page 42) amplify the need


for some type of modification of the classification scheme.

It would seem appropriate to deal specifically with the

translation, visualization and math categories and to

consider the elimination of the category of classification.

Restricting each question to only one category may also

improve the usefulness of the classification scheme.

We recognize, however, that it may be unreasonable

to classify a question into only one category when the

question clearly requires two different kinds of thinking.

A further refinement of the classification scheme may be

to assign weighting factors to indicate the relative

importance of the contributing categories.

Analysis of Correlation Coefficients

A comparison of inter-item correlation coefficients has

been made for questions grouped according to content and

kind of thinking. The first hypotheses tested by this

analysis are as follows:

Hypothesis 1: The average of correlation coefficients

for group A will be higher than the average for group B.

Hypothesis 2. The average of correlation coefficients

for group C will be higher than the average for group D.

A planned comparison between the correlations among

group A questions and the correlations among group B ques-

tions revealed a significant difference for only the reason-

ing with math category. (See Table III, page 43). Thus,


for questions of the same content, the fact that they are

also of the same kind of thinking will significantly raise

the correlation only for the reasoning with math category.

For the memorization category, the average correlation for

group A was substantially greater than that for group B

but the difference was not significant at the α = 0.05 level.

The data indicate that a subsequent analysis using a larger

sample of tests would probably produce a significant dif-

ference between groups A and B for the memorization category.

A test of the differences in correlations for groups C

and D revealed significant differences for only the mem-

orization and reasoning with math categories. That is,

when one compares correlations between pairs of items each

from a different topic, the fact that the items are from

the same kind of thinking seems to increase the correlation

for the reasoning with math, and the memorization categories

but not for the visualization, translation, or reasoning

categories.

A comparison which was not tested statistically but can

be made informally is that between the average correlation

coefficients for groups B and C. An examination of Table

II, page 42, reveals that for most categories the differences

between these two groups are relatively small. That is, the

correlations between questions of the same content but different

kinds of thinking are not appreciably different from

the correlations between questions of different content but

the same kind of thinking. It was initially thought that the


content which a question is testing would be the most important

factor controlling student performance on that question.

In this study the constraint of using fifteen item tests

as the basis of the analysis forced a rather broad defini-

tion of each content category. If a stricter delimiting of

content categories were used, the content correlation would

probably increase significantly.

The third hypothesis of this part of the study is:

Hypothesis 3. The average correlation between items

from the same topic will be higher than the average

correlation between items from different topics.

This hypothesis was supported for all of the categories

tested except for visualization. The distribution of

visualization questions was such that relatively few cor-

relation coefficients could be obtained for most tests.

Thus the averages tended to fluctuate more than they did

for the other categories. The data obtained for visualiza-

tion are probably a result of this fluctuation.

The hypothesis most directly related to the validation

of the categories is hypothesis 4 which states:

Hypothesis 4. For each of the six major classes of

questions, the average correlation between items of

the same class will be greater than the average cor-

relation between items from different classes.

The data shown in Table IV, page 45, support this hypothesis


for the categories memorization, reasoning and reasoning

with math. The hypothesis is not supported for visualiza-

tion and translation and was not tested for classification.

Unlike the visualization category, the translation cate-

gory showed a significant topic main effect but did not

show a kind of thinking main effect. Thus the category

translation is not validated by student performance. It

may be that translation, while identifiable, is too closely

related to the more prevalent categories memorization and

reasoning to be distinguishable by an examination of stu-

dent performance. To summarize this part of the study,

we can say that questions requiring memorization tend to

correlate higher with one another than they do with ques-

tions requiring some other kind of thinking. This is also

true for reasoning and reasoning with math questions.

Evaluation of Project CLIC

The categories of memorization and reasoning with math

were chosen to be the basis for the experimental remedia-

tion. The evaluation hypotheses are divided into two

parts. The first comprises those related to student progress

with material being learned during remediation, and the

second comprises those related to achievement when new chemistry

content is introduced. These latter hypotheses have been

identified as the transfer hypotheses because they relate

to the student's ability to transfer what he has learned


about a particular kind of thinking to new content in the

course.

The initial learning score has been defined as the proportion

of correct responses to questions from topics which have

been discussed in the context of the kind of thinking remedia-

tion. The transfer score is the proportion of correct

responses to questions from topics not yet discussed during

remediation. The hypotheses concerning the initial learning

score are as follows.

Hypothesis b1: For students diagnosed in need of

memorization remediation, the experimental group will

score higher than the control group on the initial

learning score.

Hypothesis b2: For students diagnosed in need of

reasoning with math remediation, the experimental group

will score higher than the control group on the initial

learning score.

For the memorization group, the average value of the

initial learning score was 0.741 for the experimental group

and 0.477 for the control group. The probability that these

values are different only by chance is less than 1 in 1000.

For the reasoning with math group, the values were 0.667

for the experimental group and 0.522 for the control group.

This difference is also highly significant. These data

indicate that students who received remediation on a


particular kind of thinking did better on questions requir-

ing that kind of thinking than did students who received

remediation on some other kind of thinking. Unfortunately

these results do not prove that the kind of thinking

addressed by the remediation was the determining factor.

In attempting to teach the student to think in a particular

way, we used specific examples from the present content of

the course. It may be that the students simply learned

from these examples and from the additional presentation

of the related content. This would not have been as sig-

nificant a problem if the two kinds of thinking were evenly

distributed throughout the various topics and subtopics of

the course. This, however, was not the case. Thus, the

tests of initial learning for the control group probably

contained less of the content discussed in remediation than

the test for the experimental group.

The hypotheses related to the question of

transfer are as follows.

Hypothesis c1: For students diagnosed in need of

memorization remediation, the experimental group will

score higher than the control group on the transfer

score.

Hypothesis c2: For students diagnosed in need of

reasoning with math remediation, the experimental group

will score higher than the control group on the transfer

score.


Neither of these hypotheses has been supported by the

data. For the memorization class, the average value of the

variable was 0.350 for the experimental group and 0.346

for the control group. The difference is not significant

at the α = .05 level. For the reasoning with math class,

the value for the experimental group was 0.418 and the

value for the control group was 0.426. The difference

between these two values is also not significant.

With this type of data, there are many alternative

hypotheses which can be put forth. I will mention two and

discuss each briefly.

The data indicate that the remediation had some positive

effect but they do not support the statement that this

positive effect was the result of improving students'

ability to think in a particular way. It may be that the

remediation failed to teach the students very much about

the kind of thinking involved but instead simply presented

once again a certain segment of the material. Since

the remedial material suggested specific methods and

strategies for the students to use, it was possible to

determine if these methods were being used successfully.

An informal check of the notebooks of five students indicated

that three of the five were using most of the techniques

and two were not using them at all. During the

reasoning with math remediation, the instructor would often

ask students to work problems using the techniques that

had been discussed. Many of the students did not demonstrate


that they had mastered the technique of dimensional analysis

which was one of the topics discussed in the class. In

general, one could conclude that some of the students

simply did not master the key techniques and did not learn

the key concepts of the remedial material. This suggests

that the remedial classes could be more effective if based

on a mastery model designed so that all students mastered

the basic skills and ideas of the remedial class. There

would, however, be practical problems involved in requiring

mastery of material which would be viewed by students

as supplementary to the material of the course.

Another alternative hypothesis is that students who

mastered the skills presented in the remedial class were

unable to apply those skills when dealing with new material

in the course. Learning to apply one's knowledge to new

situations has always been an important goal of education.

Many remedial programs in science education have been

based on the idea that the mastery of certain basic skills

and ideas which are applied in a discipline will improve

one's learning in that discipline. This is very appealing

because once the student has learned the skill, he will

supposedly be able to use that skill throughout his study.

What is sometimes forgotten is that knowing how to do some-

thing is not exactly the same as knowing when to do some-

thing. A student who has learned the technique of dimen-

sional analysis, for example, may be able to apply it to

a problem if dimensional analysis is fresh in his mind or


if he is instructed to use the technique, but may not

think to apply it to a new problem encountered a week later.

Successful attempts to teach a general learning skill and

to demonstrate the transfer of that skill have been reported.24

Hopefully more of this type of research will

be forthcoming.

The study presented here represents an initial step

in the application of current ideas in the field of edu-

cational psychology to the teaching of freshman chemistry.

We obviously have a great deal to learn about the teaching

and learning of chemistry at this level.

IMPLICATIONS FOR FUTURE RESEARCH AND DEVELOPMENT

Like many research projects, this study has created

more questions than it has answered. It has also initiated

the development of a unique learning laboratory for the

study of the learning of chemistry at the college level.

This chapter will outline the characteristics and poten-

tial of this learning laboratory and will also present

several proposals for a continuing research program in

chemical education.

A Chemical Education Laboratory

The sequence of courses to which this study was ap-

plied has an average enrollment of approximately 1500

students per term. The instruction which is delivered to

the students through the taped cassettes can be thought

of as a very controllable and well defined experimental

treatment. One knows exactly what information has been

communicated through the tapes to the student. This informa-

tion does not vary uncontrollably from term to term as does

the information delivered via a lecture mode. The instru-

ments used to measure achievement are created from a bank

of questions which can also be easily controlled.

Although each question is used only once during a

term, it is in most cases used again in subsequent terms.

By studying student performance on individual questions

or groups of related questions one can gain valuable in-

formation about which concepts or ideas are being communi-

cated effectively and which parts of the course need improve-

ment. Also, any time that modifications of instructional

materials are made, they can be easily tested by establish-

ing an experimental group which studies the new material and

comparing this group's achievement with that of a control

group which studies the old material. In this manner,

objective evidence concerning the effectiveness of instruc-

tion can be routinely obtained.

The computer management system needed to compile and

analyze the data has been established and is presently

working well, although many improvements have been proposed.

As described in Chapter 3, the students mark their answers

on machine scorable answer sheets which are then processed

by an optical scanner linked to a CDC 6500 computer.

Student performance data is stored in disc files which in

this study are analyzed by the Statistical Package for the

Social Sciences subroutines. This computerized record

keeping and data analysis system permits the processing

of thousands of pieces of data with a relatively small

expenditure of resources. Students take an average of two

attempts at each exam. Including the final, there are a

total of 13 exams administered during the two term sequence.

Obviously, this amount of testing creates a large quantity

of data in a relatively short time.

Computerized record keeping also makes it easier

to store from term to term statistical data on each item

in the question bank. Presently, only an index of difficulty,

defined as the proportion of students who get the question

wrong, is being stored along with the number of students

who have answered the question. There is theoretically no

limit to the information concerning each question which

could be stored on the question file. Considering the

number of students who take introductory chemistry and

considering the increasing numbers of students who find

the kinds of thinking required in introductory chemistry

difficult, it would appear that the information to be

gained from this learning laboratory would be an important

contribution to chemistry instruction. Specific sugges-

tions for research and development studies are outlined

in the next section.

The Difficulty Index

As mentioned earlier, the routine processing of informa-

tion includes calculating a difficulty index for each ques-

tion. Preliminary studies indicate that these indices

vary dramatically. There is a significant number of ques-

tions which over 90 percent of the students get wrong and

there is also a significant number which less than 10

percent get wrong. To this author's knowledge, most

instructors using the mastery approach, which requires

repeated examinations, assume that their alternate exam

forms are of approximately the same difficulty. The pre-

liminary data obtained in this study indicate that this is

probably not the case. It is therefore recommended that a

careful study of exam form difficulty be undertaken and,

if necessary, a method be established to keep the dif-

ficulty of exams within an acceptable limit.
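The comparison of exam form difficulty recommended here could be sketched as follows. This is a minimal illustration only: the record layout, the function names, and the 0.10 tolerance are assumptions made for the example and are not taken from the course's FORTRAN programs.

```python
from collections import defaultdict


def difficulty_index(responses):
    """Proportion of students who answered a question incorrectly.

    `responses` is a list of booleans (True = answered correctly).
    """
    if not responses:
        raise ValueError("no responses recorded for this question")
    wrong = sum(1 for correct in responses if not correct)
    return wrong / len(responses)


def form_difficulty(records):
    """Mean difficulty index of the questions on each exam form.

    `records` is an iterable of (form, question, correct) tuples.
    This layout is hypothetical.
    """
    by_question = defaultdict(list)
    for form, question, correct in records:
        by_question[(form, question)].append(correct)

    by_form = defaultdict(list)
    for (form, _question), answers in by_question.items():
        by_form[form].append(difficulty_index(answers))

    return {form: sum(values) / len(values) for form, values in by_form.items()}


def forms_outside_tolerance(form_means, tolerance=0.10):
    """Forms whose mean difficulty differs from the overall mean
    by more than `tolerance` (an arbitrary, assumed limit)."""
    overall = sum(form_means.values()) / len(form_means)
    return sorted(f for f, m in form_means.items() if abs(m - overall) > tolerance)
```

Forms flagged by `forms_outside_tolerance` would be candidates for revision before they are used as alternates for one another.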

A related area of interest is the relationship between

the difficulty of a question and the kind of thinking (as

defined by this study) which the question requires. An

informal inquiry into this question indicates no difference

in the average difficulty of the various classes of ques-

tions. The memory questions for example do not appear to

be any easier or any more difficult than the reasoning with

math questions. This needs to be studied more carefully

on a long term basis.

Throughout the present research project there has been

concern expressed about the effect of not taking into ac-

count the inherent difficulty of a question when attempting

to validate the classification scheme. It seems logical

that a student who answers three difficult memory questions

correctly should receive a higher score in "memory ability"

than a student who answers three easy memory questions.

The details of assigning some type of weighting factor for

the purposes of this analysis need to be considered care-

fully. An alternative to the weighting factor would be

to control the difficulty of the questions used so that

the scores obtained are the result of questions of

approximately the same difficulty.
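One possible form of such a weighting is sketched below. It is an assumption offered for illustration, not a scheme used in this study: each question is credited relative to the proportion of students expected to answer it correctly, so a correct answer to a difficult question counts for more than a correct answer to an easy one. The function name and the input layout are hypothetical.

```python
def relative_subscore(items):
    """Mean of (observed - expected) credit over a student's questions.

    `items` is a list of (correct, difficulty) pairs, where `difficulty`
    is the difficulty index of the question (proportion of students who
    get it wrong). The expected credit on a question is 1 - difficulty.
    """
    if not items:
        raise ValueError("no questions supplied")
    total = 0.0
    for correct, difficulty in items:
        observed = 1.0 if correct else 0.0
        expected = 1.0 - difficulty
        total += observed - expected
    return total / len(items)
```

Under this sketch, three correct answers to questions with a difficulty index of 0.9 give a subscore near 0.9, while three correct answers to questions with an index of 0.1 give a subscore near 0.1.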

Alternative Validity Studies

There have been many suggestions made pertaining

to the validity test of the proposed classification cate-

gories. This section will discuss two of these plans.

The analysis of correlation coefficients was performed

using fifteen item examinations as the basic instrument.

This produced a relatively small number of correlation co-

efficients in each of the groups A, B, C, and D which in

turn caused the averages to be calculated from as few as

three coefficients. This has in some cases produced an

unstable statistic and may account for the inability to

obtain significant differences. If a substantially longer

exam were given, the number of inter-item correlation co-

efficients would also increase as would the stability of

the average. The number of useful coefficients obtained

from a test of n items is $\frac{1}{2}\,n(n-1)$.
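The growth in the number of coefficients with test length can be checked with a short calculation. The function name is illustrative only.

```python
def n_coefficients(n_items):
    """Number of distinct inter-item correlation coefficients
    obtained from a test of n_items questions: n(n-1)/2."""
    return n_items * (n_items - 1) // 2


for n in (15, 30, 45):
    print(n, n_coefficients(n))
```

Doubling the length of a fifteen item exam therefore roughly quadruples the number of coefficients available for averaging.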

It has also been pointed out that the present classifica-

tion scheme permits the assignment of a question to more

than one class of questions. For example, a question

characterized as Rs (R1, V) would be included in both the

visualization and the reasoning category even though one of

the two may have no impact on performance on the question.

The alternative is to classify each question into only one

category, that category being the kind of thinking which

is most likely to cause the student to miss the question.

This strategy of classification is more likely to group

similar questions together and hence should improve the

average correlation between questions of the same class,

and may even result in a significant difference for the

visualization and translation categories.

Selecting the Sample

A student's ability to think in a particular way is

obviously only one of the factors that influence perfor-

mance in Chemistry 130-131. Basic interest in chemistry

and motivation to study are also very important factors.

Most of the students enrolled in this sequence of courses

are not chemistry majors but are taking the courses be-

cause they are required to do so by their major area.

In selecting those students who scored below 60 percent

on a particular kind of thinking, we chose students who were

for the most part in the bottom 40 percent of the class

when ranked by gradepoint in CEM 130. It is quite probable

that this group on the average has a lower level of interest

and motivation to study chemistry than a group ranking from

40 percent to 75 percent in class average. It has been

suggested that the CLIC project may have been more success-

ful if applied to this latter group, since these students

are likely to be more motivated and interested in the study

of chemistry. If the CLIC experiment could be run with

both of these groups simultaneously some interesting com-

parisons of achievement, improvement and participation could

be made.

A Piagetian Classification

A theory of intellectual development which has had an

impact on the teaching of chemistry at the college level

is that advanced by Swiss psychologist Jean Piaget. An

important aspect of Piaget's theory is his "stages of

intellectual development" which are listed below.

1. Sensori-motor stage (0-2 years)

2. Preoperational stage (2-7 years)

3. Concrete operations (7-11 years)

4. Formal operations.

Each stage represents a distinct set of abilities which

are usually developed during the ages indicated. Piaget

has designed many different tests of intellectual develop-

ment which are supposed to identify in which of the four

stages a person is operating. As mentioned earlier, recent

applications of the tests to college students indicate that

many of these students do not demonstrate formal thinking

in specific situations. In simple terms, this means that

the student does not deal successfully with problems re-

quiring the formation and testing of hypotheses involving

relationships between variables, or problems requiring

the control of one variable, by systematically testing

all possibilities.

Chemists have begun to look critically at the informa-

tion concerning the intellectual development of college

students and ask what effect this situation might have on

the teaching of college chemistry.35:37 It is generally

agreed that much of the thinking required in introductory

chemistry is at the formal level. We are only beginning

to sort out specifically which of the various tasks re-

quired of an introductory chemistry student could not be

done by someone not demonstrating formal thinking. Some

initial work on this question has been reported by Herron38

who identifies tasks which he believes require formal

thought but does not present any empirical evidence to

support the identification.

The kind of analysis employed in the present study

would be ideally suited to the development and testing of

a classification scheme based on Piaget's concrete and

formal operations levels of intellectual development.

Questions used in Chemistry 130-131 could be classified

according to the Piagetian level required and a correlation

analysis could be run. In addition, the score on a set

of items from a chemistry test could be compared to scores

on a traditional Piagetian test. If a reliable classifica-

tion of general chemistry questions can be made, then

a diagnosis of the Piagetian level demonstrated by students

taking chemistry could be routinely obtained.

Furthermore, procedures for increasing the students'

tendency to think at the level of formal operations could

be developed in the context of a freshman chemistry course.

This last development would be of extreme importance to

instruction since so little is presently known about whether,

or how, a person who is not in the practice of thinking at

the formal level in a particular situation can be taught

to do so.

REFERENCES

1. B. S. Bloom, Taxonomy of Educational Objectives: Cognitive Domain, David McKay, New York (1956).

2. K. V. Fast, Dissertation Abstracts, g1, 2194A (1972).

3. P. W. Airasian, Science Education, 54, 91-95 (1970).

4. H. V. Scott, Science Education, 57, 291-296 (1973).

5. American Association for the Advancement of Science, Science: A Process Approach, Washington, D.C. (1964).

6. E. L. Smith, American Educational Research Association Annual Meetings, Chicago, Illinois (1974).

7. R. M. Gagne, Interchange, 3, 1-8 (1972).

8. R. M. Gagne, The Essentials of Learning for Instruction, Dryden, Hinsdale, Illinois (1974).

9. R. M. Gagne, Educational Psychologist, 6, 1-9 (1968).

10. R. M. Gagne, Psychological Review, 69, 355-365 (1962).

11. R. M. Gagne and L. T. Brown, Journal of Experimental Psychology, 62, 313-321 (1961).

12. R. S. Tannenbaum, Journal of Research in Science Teaching, 8, 123-136 (1971).

13. C. H. Stedman, Journal of Research in Science Teaching, 12, 235-241 (1973).

14. Jean Piaget, Journal of Research in Science Teaching, 3, 176-186 (1964).

15. H. Beilin, Piagetian Research and Mathematical Education, National Council of Teachers of Mathematics, Washington, D.C. (1970).

16. T. A. Bredderman, Journal of Research in Science Teaching, 19, 189-200 (1973).

17. J. W. McKinnon, American Journal of Physics, 32, 1047-52 (1971).

18. A. E. Lawson and J. W. Renner, Science Education, 59, 545-559 (1974).

19. D. Griffiths, Unpublished Ed. D. Dissertation, Rutgers University, New Brunswick, NJ (1973).

20. F. W. Danner and M. C. Day, American Educational Research Association Annual Meeting, San Francisco, CA, April (1976).

21. M. R. Lawler and M. Riser, The Journal of Experimental Education, 55, 45-52 (1974).

22. B. Z. Shakhashiri, Journal of Chemical Education, 52, 588-592 (1975).

23. D. M. Riban, Journal of Research in Science Teaching, 3, 72-82 (1969).

24. J. H. Larkin, American Association of Physics Teachers Meeting, Chicago, Illinois (1975).

25. R. N. Hammer, 167th National Meeting, American Chemical Society, Los Angeles, California, March 1974.

26. Benjamin S. Bloom, Evaluation Comment, 2 (1968).

27. Fred S. Keller, Journal of Applied Behavioral Analysis, 2 (1968), 79-89.

28. James H. Block (ed.), Mastery Learning: Theory and Practice. New York: Holt, Rinehart and Winston, Inc., 1971.

29. R. H. Davis, L. T. Alexander, and S. L. Yelon, Learning System Design, McGraw-Hill Book Company, New York (1974).

30. R. L. Ebel, Essentials of Educational Measurement, Prentice-Hall, Inc., Englewood Cliffs, New Jersey (1972).

31. E. Kales, Personal communication, March 1976.

32. D. T. Campbell and D. W. Fiske, Psychological Bulletin, 55, 82 (1959).

33. G. V. Glass and J. C. Stanley, Statistical Methods in Education and Psychology, Prentice-Hall, Inc., Englewood Cliffs, New Jersey (1970).

34. D. G. Boyle, Students' Guide to Piaget, Pergamon Press, New York (1969).

35. J. Piaget and B. Inhelder, The Early Growth of Logic in the Child, W. W. Norton and Company, Inc., New York (1969).

36. B. S. Craig, Journal of Chemical Education, 45, 807 (1972).

37. D. W. Beistel, Journal of Chemical Education, 52, 151 (1975).

38. J. D. Herron, Journal of Chemical Education, 52, 147 (1975).

APPENDICES

APPENDIX A

FORTRAN PROGRAMS

The following are listings of the FORTRAN programs

used to perform the data analysis for this study. Each

listing begins with a brief description of the function of

the program.

Program READ

Program READ requests information from a nine-track

tape prepared at the scoring center. A disc file (student

record file) is created in which each record contains the

student number, exam form number and an indication of the

student's responses to each question.
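The scoring step performed by Program READ can be sketched in modern terms as follows. This is a simplified illustration under stated assumptions, not a translation of the listing: the function names are hypothetical, responses are assumed to be the digits 0 to 9, and a wrong response is assumed to be recorded as the corresponding letter A to J, as the comments in the listing suggest.

```python
WRONG_LETTERS = "ABCDEFGHIJ"  # assumed mapping: digit 0 -> A, ..., digit 9 -> J


def score_responses(key, responses, alternates=None):
    """Mark one answer sheet against the key.

    Returns a list with "1" for each correct response and a letter
    standing for the response chosen when it is incorrect.
    `alternates` maps a question number (1-based) to additional
    responses accepted as correct (corrections to the key).
    """
    alternates = alternates or {}
    marked = []
    for number, (correct, chosen) in enumerate(zip(key, responses), start=1):
        if chosen == correct or chosen in alternates.get(number, ()):
            marked.append("1")
        elif chosen.isdigit():
            marked.append(WRONG_LETTERS[int(chosen)])
        else:
            marked.append(chosen)  # blank or unreadable mark kept as is
    return marked


def count_wrong(marked_sheets, n_questions=15):
    """Number of students missing each question on one exam form."""
    wrong = [0] * n_questions
    for sheet in marked_sheets:
        for index, mark in enumerate(sheet):
            if mark != "1":
                wrong[index] += 1
    return wrong
```

Dividing each entry of `count_wrong` by the number of sheets processed gives the difficulty index stored for that question.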

FTN.

MAP(OFF)

ATTACH,TAPE12,WAITSP76131,PW=OLIVIA.

LGO.

UNLOAD,TAPE12.

REWIND,TAPE14.

REWIND,TAPE15.

SORTMRG.CHEM131

CATALOG,TAPE69,SAVETHIS,RP=999,ID=BOB,TK=OLIVIA.

REWIND,TAPE69.

LISTTY,I=TAPE69,B,NS,1-20.

PROGRAM A(INPUT=64,OUTPUT=512,TAPE12=512,TAPE14=512,

XTAPE6=512,TAPE5=512,TAPE89=64,TAPE50=512,TAPE51=512,TAPE36=512,

XTAPE37=512)

DIMENSION JK(15),IANS(15),LD(3),IQCP(4),IACP(4,12),HOLD(15),

XIMZT(3),LTE(6),LF(3)

IEND=0

LS=9

ACOUNT=0.

DO 232 NMR=1,15

232 HOLD(NMR)=0.

IQZ=0

ICHM=0

ICP=0

DO 2 JL=1,20000

C READS THE 9 TRACK TAPE FOR NEW INFORMATION.

READ(12,1)IT,ID,(JK(NA),NA=1,15)

1 FORMAT(I6,I3,15R1)

IF(EOF(12).NE.0)GO TO 98

IF(IT.NE.1)GO TO 68

C STUDENT NO. 1 READ AS SPECIAL INFORMATION

ILQ=0

DO 78 IXD=1,4

78 IQCP(IXD)=0

DO 66 NJ=1,3

66 LD(NJ)=JK(NJ)

REWIND 89

WRITE(89,160)IT,ID,(JK(NA),NA=1,15)

160 FORMAT(I6,I3,15R1)

REWIND 89

READ(89,164)IT,ID,(JK(NA),NA=1,15)

164 FORMAT(I6,I3,3R1,12I1)

LS=JK(9)

IF(IQZ.EQ.0)GO TO 853

GO TO 100

98 IEND=1

DO 2009 NMR=1,3

2009 LF(NMR)=LD(NMR)

LN=LS

100 IF(LN.EQ.1)GO TO 427

IECIS=0

REWIND 50

530 IF(IECIS.EQ.15)GO TO 952

283 READ(50,291)(IMZT(NMR),NMR=1,3),HMZT,(LTE(NMR),NMR=1,6)

291 FORMAT(R1,R1,R1,I2,I6,A10,A10,A10,A10,A10)

IF(EOF(50).NE.0)GO TO 506

GO TO 730

506 PRINT 544,(LF(NMR),NMR=1,3)

544 FORMAT(* HELP130*,5X,I1,I1,I1)

IF(IEND.EQ.1)GO TO 38

GO TO 854

730 IF(IMZT(1).EQ.LF(1).AND.IMZT(2).EQ.LF(2).AND.IMZT(3).EQ.LF(3))804,

X283

804 IECIS=IECIS+1

C ABZT IS THE VALUE OF THE RATIO OF THE NUMBER WRONG TO THE TOTAL (ACOUNT)

ABZT=HOLD(IECIS)/ACOUNT

IZZ=ACOUNT

WRITE(36,299)(IMZT(NMR),NMR=1,3),HMZT,(LTE(NMR),NMR=1,6),IZZ,ABZT

299 FORMAT(R1,R1,R1,I2,I6,A10,A10,A10,A10,A10,I4,F8.6)

GO TO 530

427 IECIS=0

REWIND 51

430 IF(IECIS.EQ.15)GO TO 952

429 READ(51,391)(IMZT(NMR),NMR=1,3),HMZT,(LTE(NMR),NMR=1,6)

391 FORMAT(R1,R1,R1,I2,I6,A10,A10,A10,A10,A10)

IF(EOF(51).NE.0)GO TO 510

GO TO 731

510 PRINT 509,(LF(NMR),NMR=1,3)

509 FORMAT(* HELP131*,5X,I1,I1,I1)

IF(IEND.EQ.1)GO TO 38

GO TO 854

731 IF(IMZT(1).EQ.LF(1).AND.IMZT(2).EQ.LF(2).AND.IMZT(3).EQ.LF(3))805,

X429

805 IECIS=IECIS+1

C ABZT IS THE VALUE OF THE RATIO OF THE NUMBER WRONG TO THE TOTAL (ACOUNT)

ABZT=HOLD(IECIS)/ACOUNT

IZZ=ACOUNT

WRITE(37,296)(IMZT(NMR),NMR=1,3),HMZT,(LTE(NMR),NMR=1,6),IZZ,ABZT

296 FORMAT(R1,R1,R1,I2,I6,A10,A10,A10,A10,A10,I4,F8.6)

GO TO 430

952 DO 230 NMR=1,15

230 HOLD(NMR)=0.

IF(IEND.EQ.1)GO TO 38

ACOUNT=0.

GO TO 854

68 IF(IT.NE.2)GO TO 81

C STUDENT NO. 2 READ AS KEY

82 DO 84 NJ=1,15

84 IANS(NJ)=JK(NJ)

GO TO 2

C STUDENT NO. 3 READ FOR CORRECTIONS TO THE KEY

81 IF(IT.NE.3)GO TO 67

ILQ=1

REWIND 89

WRITE(89,1201)IT,ID,(JK(NA),NA=1,15)

1201 FORMAT(I6,I3,15R1)

REWIND 89

READ(89,1202)IT,ID,(JK(NA),NA=1,15)

1202 FORMAT(I6,I3,2I1,13R1)

ICP=ICP+1

IQCP(ICP)=JK(1)*10+JK(2)

DO 77 ICD=3,12

77 IACP(ICP,ICD)=JK(ICD)

GO TO 2

C CONVERTS RESPONSES (R FORMAT). A 1 IF CORRECT, A LETTER REPRESENTING THE

C RESPONSE CHOSEN IF INCORRECT

67 DO 17 NJ=1,15

IF(IANS(NJ)-JK(NJ))14,18,14

18 JK(NJ)=1R1

GO TO 17

14 IF(ILQ.EQ.0)GO TO 71

DO 79 NA=1,4

IF(NJ.EQ.IQCP(NA))GO TO 39

79 CONTINUE

GO TO 71

39 DO 75 III=3,12

IF(IACP(NA,III).EQ.1R9)GO TO 71

75 IF(JK(NJ).EQ.IACP(NA,III))GO TO 18

71 JK(NJ)=JK(NJ)-32B

17 CONTINUE

ACOUNT=ACOUNT+1.

DO 210 NMR=1,15

IF(JK(NMR).EQ.1R1)GO TO 210

C HOLD KEEPS TRACK OF THE NUMBER WRONG PER EXAM FORM NUMBER

HOLD(NMR)=HOLD(NMR)+1.

210 CONTINUE

IF(LS-1)94,27,27

94 WRITE(5,2047)IT,ID

2047 FORMAT(I6,I3.3)

C WRITES ON TAPE 14 FOR CHEM 130 STUDENTS

WRITE(14,90)IT,(LD(NA),NA=1,3),ID,(JK(NA),NA=1,15)

90 FORMAT(I6,3R1,I3,15R1)

GO TO 2

27 WRITE(6,1749)IT,ID

1749 FORMAT(I6,I3.3)

C WRITES ON TAPE 15 FOR CHEM 131 STUDENTS

WRITE(15,91)IT,(LD(NA),NA=1,3),ID,(JK(NA),NA=1,15)

91 FORMAT(I6,3R1,I3,15R1)

GO TO 2

853 IQZ=1

854 DO 609 NMR=1,3

609 LF(NMR)=LD(NMR)

LN=LS

IF(ICHM.NE.0)GO TO 126

C WRITES ON TAPE5 FOR ACCESS BY TELETEST FOR CHEM 130 SCORES.

WRITE(5,70)LD(1)

70 FORMAT(*SCORE*/*SCORE*/R1,*,0*)

C WRITES ON TAPE6 FOR ACCESS BY TELETEST FOR CHEM 131 SCORES.

WRITE(6,72)LD(1)

72 FORMAT(*SCORE*/*SCORE*/R1,*,0*)

ICHM=1

126 IF(LS-1)1524,1523,1523

1524 WRITE(5,85)(JK(NA),NA=4,8),LD(1)

85 FORMAT(*CD*/2I1,*.*,2I1/R1,*.0*/*AUTO*/R1)

GO TO 2

1523 WRITE(6,97)(JK(NA),NA=4,8),LD(1)

97 FORMAT(*CD*/2I1,*.*,2I1/R1,*.0*/*AUTO*/R1)

2 CONTINUE

38 WRITE(5,74)

74 FORMAT(*E*)

WRITE(6,105)

105 FORMAT(*E*)

PRINT 1005,JL

1005 FORMAT(* THE LOOP IS NOW AT*,I5)

98 ENDFILE 14

ENDFILE 15

END

SORT(1,1,90)

FILE(TAPE15,S,D, ,O,N)

FILE(TAPE69,O,D, ,O,N)

KEY(A,C,7,9)

RECORD(I,U,90)

END

SORT(1,1,90)

FILE(TAPE37,S,D, ,O,N)

FILE(TAPE69,M,D, ,O,N)

FILE(TAPE47,O,D, ,O,N)

KEY(A,C,1,5)

RECORD(I,U,90)

SORT(1,1,90)

FILE(TAPE15,S,D, ,O,N)

FILE(TAPE68,M,D, ,O,N)

FILE(TAPE60,O,D, ,O,N)

KEY(A,C,7,9)

RECORD(I,U,90)

END

Program PERCENTAGES

Program PERCENTAGES processes the ECI file and the student record

file and calculates the proportion of questions from a specified class

of questions, which the student has answered correctly.
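The calculation performed for each student and each class of questions can be sketched as follows. The function name and argument layout are illustrative assumptions; the rounding (add 0.5, then truncate) follows the arithmetic that appears in the listing.

```python
def class_percentage(marked, class_questions):
    """Score one student on the questions belonging to one class.

    `marked` is the student's marked responses for one exam, with "1"
    meaning correct. `class_questions` lists the question numbers
    (1-based) on that exam form that belong to the class of interest.
    Returns (number right, number possible, percent right).
    """
    possible = len(class_questions)
    right = sum(1 for number in class_questions if marked[number - 1] == "1")
    if possible == 0:
        return 0, 0, 0
    percent = int(right / possible * 100.0 + 0.5)
    return right, possible, percent


# Example: questions 1, 2 and 3 are memory questions on this form.
sheet = list("1B11D1111111111")
print(class_percentage(sheet, [1, 2, 3]))
```

A student with no questions of a given class on a form receives zeros, which mirrors the way the listing leaves its counters at zero in that case.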

ATTACH,TAPE51,SPMP,PW=OLIVIA.

ATTACH,TAPE52,SPMR,PW=OLIVIA.

ATTACH,TAPE53,SPM1,PW=OLIVIA.

ATTACH,TAPE54,SPM2,PW=OLIVIA.

SORTMRG.

SORTMRG.

REWIND,TAPE2.

REWIND,TAPE8.

ATTACH,TAPE10,FINAL131,PW=OLIVIA.

FTN.

LGO.

CATALOG,TAPE15,SP76FINAL131,ID=BOB,RP=999,TK=OLIVIA.

REWIND,TAPE15.

COPYSBF,TAPE15,OUTPUT.

SORT(2,1 ,90)

FILE(TAPE51 ,S,D, ,O,N)


FILE (TAPE2,O,D, ,O,N)

KEY (A,C,1 ,5)

RECORD(I ,U,90)

END

SORT(2,1 ,90)



FILE (TAPE8,O,D, ,O,N)

KEY(A ,C ,1 ,5)

RECORD(I ,U,90)

END

PROGRAM B(INPUT=64,OUTPUT=512,TAPE2=512,TAPE8=512,TAPE10=512,TAPE

X14=512,TAPE15=512,TAPE20=512,TAPE80=512)

DIMENSION JK(15),IBMA(15),ICMA(15),IHAD(15),IBT(15),AN(15)

400 ILTC=0

KN=0

KQ=0

KL=0

ILT=0

19 READ(10,3)ISN,K,NS,(JK(NA),NA=1,15)

3 FORMAT(I6,I3,I3,15A1)

IF(EOF(10).NE.0)GO TO 121

4634 BP=0.

BR=0.

BC=0.

IF(ILT.EQ.1)GO TO 139

98 IF(KL-K)62,36,139

36 DO 5 NA=1,15

IF(IBMA(NA).EQ.0)GO TO 139

NL=IBMA(NA)

BP=BP+1.

IF(JK(NA).NE.1H1)GO TO 97

101 BR=BR+1.

97 BC=BR/BP*100.+.5

5 CONTINUE

139 IBP=BP

IBR=BR

IBC=BC

GO TO 39

17 ILT=1

GO TO 139

39 WRITE(14,8)ISN,K,NS,(JK(NA),NA=1,15),IBR,IBP,IBC

8 FORMAT(I6,I3,I2,15A1,I2,I2,I3)

GO TO 19

62 NB=1

DO 72 NA=1,15

72 IBMA(NA)=0

IF(KL.EQ.0)GO TO 78

IBMA(1)=KQ

NB=2

78 DO 52 NZ=NB,16

KL=KN

IF(ILTC.EQ.1)GO TO 17

83 READ(2,1)KN,IBMA(NZ)

1 FORMAT(I3,I2)

IF(EOF(2).NE.0)GO TO 57

IF(KL.EQ.0)GO TO 52

IF(KN-KL)52,52,61

52 CONTINUE

GO TO 61

57 ILTC=1

61 KQ=IBMA(NZ)

IBMA(NZ)=0

GO TO 98

121 REWIND 14

ILTC=0

ILT=0

KN=0

KQ=0

KL=0

16 READ(14,99)ISN,K,NS,(JK(NA),NA=1,15),IBR,IBP,IBC

99 FORMAT(I6,I3,I2,15A1,I2,I2,I3)

IF(EOF(14).NE.0)GO TO 2027

CP=0.

CR=0.

CC=0.

IF(ILT.EQ.1)GO TO 149

88 IF(KL-K)9,46,149

46 DO 26 NA=1,15

IF(ICMA(NA).EQ.0)GO TO 149

NL=ICMA(NA)

CP=CP+1.

IF(JK(NA).NE.1H1)GO TO 87

111 CR=CR+1.

87 CC=CR/CP*100.+.5

26 CONTINUE

149 ICP=CP

ICR=CR

ICC=CC

GO TO 49

27 ILT=1

GO TO 149

49 WRITE(15,18)ISN,K,NS,(JK(NA),NA=1,15),IBR,IBP,IBC,ICR,ICP,ICC

18 FORMAT(I6,I3,I2,15A1,I2,I2,I3,I2,I2,I3)

GO TO 16

9 NB=1

DO 71 NA=1,15

71 ICMA(NA)=0

IF(KL.EQ.0)GO TO 68

ICMA(1)=KQ

NB=2

68 DO 51 NZ=NB,16

KL=KN

IF(ILTC.EQ.1)GO TO 27

84 READ(8,10)KN,ICMA(NZ)

10 FORMAT(I3,I2)

IF(EOF(8).NE.0)GO TO 56

IF(KL.EQ.0)GO TO 51

IF(KN-KL)51,51,41

51 CONTINUE

GO TO 41

56 ILTC=1

41 KQ=ICMA(NZ)

ICMA(NZ)=0

GO TO 88

2027 CONTINUE

END

Program PRINTOUT

Program PRINTOUT calculates average subtest scores for all tries

of each exam for each student and then a grand average subtest score

for the entire term.

ATTACH ,TAPE15 ,SAVEWAIT76131 ,PW=OLIVIA.

PNPURGE ,PPN=IWAIT131 .

ATTACH ,TAPE72 ,1 IWAIT131 ,PW=OLIVIA.

SORTMRG.

CATALOG,TAPE2,SP76STUDENTRECORDS,ID=BOB,RP=999,TK=OLIVIA.

REWIND,TAPE2.

MAP(OFF)

FTN.

LGO.

REWIND,TAPE20.

COPYBF,TAPE20,OUTPUT.

SORT(2 ,1 ,90)

FILE (TAPE15,S,D, ,O ,N)


FILE (TAPE2,O,D, ,O,N)

KEY (A,C,1,9)

RECORD (I ,U ,90)

END

PROGRAM PR(INPUT=512,OUTPUT=512,TAPE2=512,TAPE20=512,TAPE42=512,

XTAPE43)

DATA J,N,I,IT,IW/5*0/,ZA,ZB,ZC,ZD,AB,AR,BR,DR,ER,BB,DB,EB,SCORE,

XCOUNT,WRMA,WRMB,WRRA,WRRB/18*0./

WRITE(42,1000)

1000 FORMAT(*STUDENT NUMBER*,5X,*M(1)*,5X,*M(1-3)*,5X,*M(4-6)*,5X,*R(1)

X*,5X,*R(1-3)*,5X,*R(4-6)*,5X,*TOTAL PERCENT*)

2 JK=J

IF(JK.EQ.0)GO TO 1002

IF(IW.NE.0)GO TO 99

906 WRMA=WRMA+A

WRMB=WRMB+B

WRRA=WRRA+D

WRRB=WRRB+E

IF(J.NE.1)GO TO 400

IF(WRMB.EQ.0)GO TO 201

MPER=WRMA/WRMB*100.+.5

203 IF(WRRB.EQ.0)GO TO 205

IRPER=WRRA/WRRB*100.+.5

GO TO 400

201 MPER=0

GO TO 203

205 IRPER=0

400 IF(J.GT.3)GO TO 800

IF(WRMB.EQ.0)GO TO 401

MMPER=WRMA/WRMB*100.+.5

403 IF(WRRB.EQ.0)GO TO 405

IRRPER=WRRA/WRRB*100.+.5

GO TO 1002

401 MMPER=0

GO TO 403

405 IRRPER=0

GO TO 1002

800 AR=AR+A

BR=BR+B

DR=DR+D

ER=ER+E

IF(BR.EQ.0)GO TO 901

LAPERM=AR/BR*100.+.5

903 IF(ER.EQ.0)GO TO 905

LAPERR=DR/ER*100.+.5

GO TO 1002

901 LAPERM=0

GO TO 903

905 LAPERR=0

GO TO 1002

99 IF(COUNT.NE.1)GO TO 980

IPER=0

GO TO 1005

980 IPER=((SCORE-K)/((COUNT-1.)*15.))*100.+.5

1005 WRITE(42,98)IA,MPER,MMPER,LAPERM,IRPER,IRRPER,LAPERR,IPER

98 FORMAT(4X,I6,8X,I4,6X,I4,7X,I4,6X,I4,6X,I4,7X,I4,10X,I4)

WRITE(43,67)IA,MPER,MMPER,LAPERM,IRPER,IRRPER,LAPERR,IPER

67 FORMAT(I6,I4,I4,I4,I4,I4,I4,I4)

IF(IT.EQ.1)GO TO 100

IW=0

SCORE=K

COUNT=1.

WRMA=0.

WRMB=0.

WRRA=0.

WRRB=0.

AR=0.

BR=0.

DR=0.

ER=0.

GO TO 906

1002 IA=I

READ(2,1)I,N,K,KA,KB,KC,A,B,C,D,E,F

1 FORMAT(I6,I3,I2,A5,A5,A5,F2.0,F2.0,F3.0,F2.0,F2.0,F3.0)

IF(EOF(2).NE.0)GO TO 200

SCORE=SCORE+K

COUNT=COUNT+1.

IAA=A

IBB=B

ICC=C

IDD=D

IEE=E

IFF=F

J=N/100

GO TO 51

200 IT=1

GO TO 10

51 IF(IA.EQ.0)GO TO 11

IF(IA.NE.I)GO TO 10

IF(JK.NE.J)GO TO 8

11 AB=AB+A

BB=BB+B

DB=DB+D

EB=EB+E

WRITE(20,26)I,N,K,KA,KB,KC,IAA,IBB,ICC,IDD,IEE,IFF

26 FORMAT(1X,I6,1X,I3,1X,I2,1X,A5,A5,A5,I2,I2,I3,I2,I2,I3)

GO TO 2

8 IF(BB.EQ.0.)GO TO 74

APA=AB/BB*100.

GO TO 22

74 APA=0.

22 IF(EB.EQ.0.)GO TO 77

APB=DB/EB*100.

GO TO 33

77 APB=0.

33 IPB=APB+.5

IPA=APA+.5

LA=AB

LB=BB

LD=DB

LE=EB

WRITE(20,27)JK,LA,LB,IPA,LD,LE,IPB

27 FORMAT(1H+,45X,*THE TOTALS FOR EXAM*,I3,* ARE*,I4,I4,I5,I4,I4,I5)

ZA=ZA+AB

ZB=ZB+BB

ZC=ZC+DB

ZD=ZD+EB

WRITE(20,88)I,N,K,KA,KB,KC,IAA,IBB,ICC,IDD,IEE,IFF

88 FORMAT(1X,I6,1X,I3,1X,I2,1X,A5,A5,A5,I2,I2,I3,I2,I2,I3)

AB=0.

BB=0.

DB=0.

EB=0.

AB=AB+A

BB=BB+B

DB=DB+D

EB=EB+E

GO TO 2

10 IF(BB.EQ.0.)GO TO 75

APA=AB/BB*100.

GO TO 23

75 APA=0.

23 IF(EB.EQ.0.)GO TO 76

APB=DB/EB*100.

GO TO 34

76 APB=0.

34 IPA=APA+.5

IPB=APB+.5

LA=AB

LB=BB

LD=DB

LE=EB

WRITE(20,28)JK,LA,LB,IPA,LD,LE,IPB

28 FORMAT(1H+,45X,*THE TOTALS FOR EXAM*,I3,* ARE*,I4,I4,I5,I4,I4,I5)

ZA=ZA+AB

ZB=ZB+BB

ZC=ZC+DB

ZD=ZD+EB

IF(ZB.EQ.0)GO TO 105

ZPA=ZA/ZB*100.

GO TO 106

105 ZPA=0.

106 IF(ZD.EQ.0.)GO TO 107

ZPB=ZC/ZD*100.

GO TO 108

107 ZPB=0.

108 MV=ZA

MX=ZB

MY=ZC

MZ=ZD

IZPA=ZPA

IZPB=ZPB

IW=1

WRITE(20,38)MV,MX,IZPA,MY,MZ,IZPB

38 FORMAT(46X,*THE GRAND TOTALS ARE*,6X,I4,I4,I5,I4,I4,I5,/)

IF(IT.EQ.1)GO TO 2

ZA=0.

ZB=0 .

ZC=0 .

ZD=0 .

AB=0.

BB=0.

DB=0 .

EB=0.

GO TO 11

100 CONTINUE

END

Program FACTOR

Program FACTOR performs a factor analysis on specified exams and

produces a matrix of inter-item correlation coefficients.
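The FORTRAN portion of this program converts each correlation coefficient r to the value z = 0.5 ln((1 + r)/(1 - r)), which is the Fisher transformation. A minimal sketch of that transformation, and of one way it might be used to average a group of coefficients, is given below. The function names are illustrative, and averaging in the z scale followed by back-transformation is an assumption about how the z values would be used, not a statement of the procedure in the listing.

```python
import math


def fisher_z(r):
    """Fisher transformation of a correlation coefficient (|r| < 1)."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def mean_correlation(coefficients):
    """Average a group of correlation coefficients in the z scale
    and convert the mean z back to a correlation."""
    if not coefficients:
        raise ValueError("no coefficients supplied")
    z_bar = sum(fisher_z(r) for r in coefficients) / len(coefficients)
    return math.tanh(z_bar)


print(fisher_z(0.30), mean_correlation([0.10, 0.30, 0.50]))
```

The transformation is undefined for r equal to 1 or -1, which is why the diagonal of the correlation matrix must be skipped.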

HAL,SPSS,D=X.

REWIND,BCDOUT.

MAP(OFF)

FTN.

LGO.

REWIND,TAPE6.

COPYSBF,TAPE6,OUTPUT.

RUN NAME FACTOR ANALYSIS FOR CEM 130 EXAM 217

DATA LIST FIXED /1 STUNUM 1-6,EFN 7-9,S 10-11,Q1 TO Q15 12-26 (A)/

SELECT IF (EFN EQ 217)

N OF CASES UNKNOWN

RECODE Q1 TO Q15 ('A','B','C','D','E','F','G','H','I','J'=0)

(CONVERT)

FREQUENCIES INTEGER=Q1 TO Q15 (0,1)

OPTIONS 8

STATISTICS 1,5

READ INPUT DATA

FACTOR VARIABLES=Q1 TO Q15/TYPE=PA2/FACSCORE/NFACTORS=3/

OPTIONS 5

STATISTICS ALL

FINISH

PROGRAM A(INPUT=64,OUTPUT=112,BCDOUT,TAPE5=BCDOUT,TAPE6=112)

DIMENSION B(15,15) ,C(15,15)

SD=0

XS=0

WRITE(6,3)

3 FORMAT(52X,*X VALUES* ,17X,*Z VALUES*,//)

DO 50 I=1,15

50 READ(5,6) (B(I,J) ,J=1,15)

6 FORMAT(8F10.7)

DO 5 NL=1,15

DO 5 NA=1,15

IF(NA.EQ.NL)GO TO 5

XS=XS+B (NL ,NA)

5 CONTINUE

XA=XS/225.

DO 8 NL=1,15

DO 8 NA=1,15

IF(NA.EQ.NL)GO TO 8

C(NL,NA)=.5*ALOG((1+B(NL,NA))/(1-B(NL,NA)))

8 CONTINUE

DO 9 NL=1,15

DO 9 NA=1,15

IF(NA.EQ.NL)GO TO 9

WRITE(6,2)B(NL,NA) ,C(NL,NA)

2 FORMAT(50X,F10.7,15X,F10.7)

9 CONTINUE

WRITE(6,10)XA

10 FORMAT(//,* THE AVERAGE IS*,F10.7)

END

Program ANOVA

Program ANOVA performs a one-way and a two-way analysis of variance

as well as posthoc analyses on the average correlation coefficients.

HAL,SPSS.

RUN NAME ANOVA AND ONEWAY VISUALIZATION VS NON-VISUALIZATION

DATA LIST FIXED /1 EXAM 1-4,ZRBAR 6-9,CO 11,CL 13,WAYONE 15/

N OF CASES 24

ANOVA ZRBAR BY CO (1,2) CL (1,2)/

READ INPUT DATA

ONEWAY ZRBAR BY WAYONE (1,4)/

RANGES = TUKEY/

RANGES = SCHEFFE (.05)/

STATISTICS ALL

FINISH

APPENDIX B

SAMPLE QUESTION CLASSIFICATIONS

To help clarify the method of characterizing questions,

a series of sample questions and their classifications are

given below.

1. "An empty aluminum Coke can weighs 50 grams. How

many moles of aluminum does one Coke can contain?

(Atomic weight Al=27)"

Ans. 1.85 moles

Classification: R1(M1) The relationship being

applied is 27 grams Al = 1 mole. The student

must find the number of moles in 50 grams by

setting up and solving a linear equation.

2. "Elements which are most metallic are found in

what general area of the periodic chart?"

Ans. Lower left.

Classification: Mr This question is included

because it can easily be interpreted in two

different ways. We could say that a property

of the most metallic elements is that they

are located in the lower left on the periodic

chart. An alternative interpretation is that

there is a relationship between the metallic

character of an element and its position on

the periodic chart. By convention, the mem-

orized information is interpreted as a rela-

tionship and the question is classified as

Mr.

3. "Fifteen grams of nitric oxide (NO) contain how

many molecules?"

Ans. 3.0 x 10^23

Classification: Rs (Tmc, R2(M1e))

The R2 implies two relationships being applied to the

problem. The student must realize that for NO, 1 mole =

30 grams. This step is Tmc, since the chemical symbol NO

is translated to a mathematical relationship. At this point

the relationship is used to find the number of moles in

15 grams (R1(M1)) and finally the memorized relationship 1

mole = 6.02 x 10^23 particles is used to find the number of

molecules (R1(M1e)). The e designates the use of a

number written in scientific notation. Whenever the results

of an R process are used in a subsequent R process, the

two can be combined into one R2 process. The Rs designation

is used because the question requires the sequencing of

the translation and reasoning steps.

4. "What is the percentage by weight of fluorine in

phosphorus (III) fluoride (PF3)?"

Ans. 65

Classification R3 (Tmc, R2(M1))

The question is given the classification R3 because in its

associated instructional setting the step by step procedure

for solving these types of problems is given to the student.

The solution then becomes a matter of following the direc-

tions given. Within the algorithm, the student is in-

structed to translate from the chemical symbol to the

mathematical relationship between the number of constituent

atoms in a molecule. The number of atoms is then converted

to the weight of the atoms which is then expressed as a

percentage (R2(M1)).
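The arithmetic behind the answers to sample questions 1, 3 and 4 can be checked with the short sketch below. The rounded atomic weights for N, O, P and F are standard values supplied here as assumptions; only the value for aluminum and the figure 6.02 x 10^23 appear in the questions themselves. The function names are illustrative.

```python
AVOGADRO = 6.02e23  # particles per mole, as used in question 3

# Rounded atomic weights (assumed values, except Al which is given in question 1)
ATOMIC_WEIGHT = {"Al": 27, "N": 14, "O": 16, "P": 31, "F": 19}


def moles(mass_g, molar_mass):
    """Moles of substance in a given mass: one linear relationship (R1)."""
    return mass_g / molar_mass


def molecules(mass_g, molar_mass):
    """Number of molecules: two relationships applied in sequence (R2)."""
    return moles(mass_g, molar_mass) * AVOGADRO


def percent_by_weight(part_mass, total_mass):
    """Percentage by weight of one constituent."""
    return 100.0 * part_mass / total_mass


# Question 1: moles of Al in a 50 gram can
q1 = moles(50, ATOMIC_WEIGHT["Al"])

# Question 3: molecules in 15 grams of NO (molar mass 14 + 16 = 30)
q3 = molecules(15, ATOMIC_WEIGHT["N"] + ATOMIC_WEIGHT["O"])

# Question 4: percent fluorine by weight in PF3
fluorine = 3 * ATOMIC_WEIGHT["F"]
q4 = percent_by_weight(fluorine, ATOMIC_WEIGHT["P"] + fluorine)

print(round(q1, 2), q3, round(q4))
```

Each function corresponds to one reasoning step in the classification, which is why question 3 chains two of them.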

APPENDIX C

CALCULATING GROUPED AVERAGE CORRELATION COEFFICIENTS

The following example is included to help clarify the

procedure for calculating the average correlation co-

efficients for the groups of questions A, B, C, D.

In this example, a 15 item test is analyzed for the

reasoning category. The content of the test is divided into

two topics, T1 and T2. The first step is to identify questions

containing a reasoning process. In this case they are:

1. Reasoning Questions: 2, 4, 6, 8, 12, 13, 14, 15

Questions which do not require reasoning are therefore:

Nonreasoning Questions: 1, 3, 5, 7, 9, 10, 11

2. The questions are also classified by topic.

Topic 1 Questions: 1, 2, 4, 5, 10, 11, 14, 15

Topic 2 Questions: 3, 6, 7, 8, 9, 12, 13

3. Groups of correlation coefficients are formed as

follows:

Group A is all coefficients between questions from

the same content which require reasoning.

That is, Group A = {r(K1Tx, K1Tx)}

where K1 = kind of thinking 1, which in

this case is reasoning;

and r = the correlation coefficient operator.

Group A coefficients are:

r(2,4)   r(2,15)  r(4,15)  r(8,12)  r(6,12)  r(12,13)

r(2,14)  r(4,14)  r(8,6)   r(8,13)  r(6,13)  r(14,15)

The values of the above correlation coefficients are

then averaged to obtain an average correlation for

Group A.

Group B is all coefficients between pairs of questions

from the same topic with only one of the pair being a

reasoning question.

That is, Group B = {r(K1Tx, KyTx)}, y ≠ 1

Group B coefficients are:

r(1,2)   r(2,5)   r(4,10)  r(10,14)  r(3,6)   r(3,13)  r(7,12)

r(1,4)   r(2,10)  r(4,11)  r(10,15)  r(3,8)   r(6,7)   r(7,13)

r(1,14)  r(2,11)  r(5,14)  r(11,14)  r(7,8)   r(6,9)   r(8,9)

r(1,15)  r(4,5)   r(5,15)  r(11,15)  r(3,12)  r(9,12)  r(9,13)

Group C is all coefficients between pairs of questions both of which require reasoning but each of the pair being from a different topic.

That is, Group C = {r(K1Tx, K1Ty)}, x ≠ y

Group C coefficients are:

r_{2,8}  r_{2,6}  r_{2,12}  r_{4,12}  r_{8,14}  r_{6,14}  r_{12,14}  r_{13,14}

r_{2,13}  r_{4,8}  r_{4,6}  r_{4,13}  r_{8,15}  r_{6,15}  r_{12,15}  r_{13,15}

Group D is all coefficients between pairs of questions, only one of which requires reasoning, with each of the pair being from a different topic.

That is, Group D = {r(K1Tx, KyTz)}, x ≠ z, y ≠ 1

Group D coefficients are:

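r_{2,3}  r_{4,7}  r_{14,9}  r_{8,10}  r_{6,10}  r_{12,10}  r_{13,10}

r_{2,7}  r_{4,9}  r_{15,3}  r_{8,11}  r_{6,11}  r_{12,11}  r_{13,11}

r_{2,9}  r_{14,3}  r_{15,7}  r_{8,1}  r_{6,1}  r_{12,1}  r_{13,1}

r_{4,3}  r_{14,7}  r_{15,9}  r_{8,5}  r_{6,5}  r_{12,5}  r_{13,5}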
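The four group definitions can be checked mechanically. The following Python sketch sorts every pair of questions from the 15-item example into Groups A through D using only the reasoning and topic classifications given above. The function and variable names are hypothetical, and the sketch is an illustration of the definitions rather than the program used in the study.

```python
from itertools import combinations

# Classifications taken from steps 1 and 2 of the example.
REASONING = {2, 4, 6, 8, 12, 13, 14, 15}
TOPIC_1 = {1, 2, 4, 5, 10, 11, 14, 15}  # all other questions are Topic 2

def group_of(i, j):
    """Return 'A', 'B', 'C' or 'D' for a pair of questions, or None."""
    same_topic = (i in TOPIC_1) == (j in TOPIC_1)
    n_reasoning = (i in REASONING) + (j in REASONING)
    if n_reasoning == 2:
        return "A" if same_topic else "C"
    if n_reasoning == 1:
        return "B" if same_topic else "D"
    return None  # neither question requires reasoning

groups = {"A": [], "B": [], "C": [], "D": []}
for i, j in combinations(range(1, 16), 2):
    label = group_of(i, j)
    if label is not None:
        groups[label].append((i, j))

for label, pairs in groups.items():
    print(label, pairs)
```

Pairs in which neither question requires reasoning fall outside all four groups and are not used in the analysis of the reasoning category.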

In this manner, one test would yield four average correlation coefficients, one from each of the groups A, B, C and D. A set of tests analyzed for the reasoning category would yield a set of average correlation coefficients for each group. These sets of coefficients are then used as the data for the analysis of variance.
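The averaging step can be illustrated with a small Python sketch. The four-question test, its classifications, and the correlation values below are made up purely for illustration and are not data from the study; the function names are likewise hypothetical.

```python
from collections import defaultdict

def group_label(i, j, reasoning, topic_of):
    """Assign a pair of questions to Group A, B, C or D, or to None."""
    n_reasoning = (i in reasoning) + (j in reasoning)
    if n_reasoning == 0:
        return None  # pair is not used for this category
    same_topic = topic_of[i] == topic_of[j]
    if n_reasoning == 2:
        return "A" if same_topic else "C"
    return "B" if same_topic else "D"

def group_averages(corr, reasoning, topic_of):
    """Average the inter-item correlations within each group."""
    buckets = defaultdict(list)
    for (i, j), r in corr.items():
        label = group_label(i, j, reasoning, topic_of)
        if label is not None:
            buckets[label].append(r)
    return {g: sum(v) / len(v) for g, v in sorted(buckets.items())}

# Made-up classifications and correlations for a 4-question test.
reasoning = {1, 2, 3}
topic_of = {1: "T1", 2: "T1", 3: "T2", 4: "T1"}
corr = {(1, 2): 0.40, (1, 3): 0.20, (1, 4): 0.30,
        (2, 3): 0.10, (2, 4): 0.10, (3, 4): 0.05}

averages = group_averages(corr, reasoning, topic_of)
print({g: round(a, 3) for g, a in averages.items()})
```

Each test contributes one such set of four averages, and the averages collected over several tests form the observations for the analysis of variance.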

The specific topics which were chosen for this analysis are listed below.

1. CEM 130 Exam 2

Topic 1 - Crystal Structure

Topic 2 - Electromagnetic Radiation

Topic 3 - Structure Determination in Crystals.

2. CEM 130 Exam 3

Topic 1 - Particles and Waves

Topic 2 - Emission Spectroscopy

Topic 3 - Quantum Numbers

3. CEM 131 Exam 1

Topic 1 - Ideal Gases

Topic 2 - Phase Transformations

4. CEM 131 Exam 2

Topic 1 - The Equilibrium Constant

Topic 2 - Calculations Based upon the Equilibrium Law

5. CEM 131 Exam 3

Topic 1 - Solutions

Topic 2 - Concentration and Colligative Properties

Topic 3 - Ionic Equilibria

APPENDIX D

CLIC TAPE OUTLINES

CLIC Tape A-1

I. Memory Skills

A. The importance of memorization

B. Note taking

1. Noting definitions and examples

2. Previewing study guide questions

3. The 2-5-1 format

C. Reviewing your notes

1. Cueing

2. Establishing memory traces

3. Association and understanding

4. Repression - developing a good attitude

5. Self confidence

6. Timing your review

II. Application to Chemistry Concepts

A. Vapor pressure

1. The "gas can" example

2. Factors affecting vapor pressure

B. Cooling curves

1. A related experiment

2. Heat capacity.

CLIC Tape A-2

I. Review of Memory Skills

A. Lecture cueing

B. Examples

C. Note taking and studying

II. Application to Concepts of Chemistry

A. Irreversible Processes

1. Definitions: reversible, irreversible

2. Examples

B. Equilibria

1. Reaction rates

2. The equilibrium law

3. LeChatelier's Principle

a. Changing concentration

b. Changing pressure

c. Changing temperature

CLIC Tape A-3

I. Review of Memory Skills

A. Using the study guide

B. Lecture cues - examples

C. Studying and repression

II. Chemistry Concepts

A. Solutions and Mixtures

B. Concentration terms

1. Normality

2. Molarity

3. Molality

4. Weight percent

5. Saturated

6. Supersaturated

C. Factors which affect solubility

1. Charge density - charge to radius ratio

2. Temperature

3. Pressure

D. Colligative properties

E. Electrolytes

CLIC Tape B-1

I. Developing Reasoning With Math Skills

A. Symbolic equations

1. Properties represented

2. Units of the variables - units conversion

3. Manipulating symbolic equations

4. Using two symbolic equations in sequence

5. Checking your mathematics

B. A General approach to problem solving

1. Reading the problem, noting givens and unknowns.

2. Applying relationships to the problem

3. Setting up the solution

4. Checking the units and the math

II. Applications

A. Ideal Gas Law calculations

1. PV = nRT

2. Boyle's and Charles's Law problems

B. Dimensional analysis applied to specific heat problems

CLIC Tape B-2

I. Problem Solving Principles

A. Using symbolic equations in problem solving

1. Symbol-variables

2. Units-unit conversion

3. Manipulating the equation

B. Developing a problem solving approach

C. Using dimensional analysis

II. Applications

A. The Equilibrium Law

1. The symbolic equation

2. Variables and units

3. Working with initial concentrations

CLIC Tape B-3

I. Review of Symbolic Equations

A. Variables and units

B. Deriving new equations

C. Unit conversions

D. Dimensional analysis

II. Applications

A. Colligative Properties

1. Freezing point depression

2. Boiling point elevation

B. Weight percent problems

C. Concentration problems and dimensional analysis

1. Molarity

2. Normality
