
Designing Chemistry Practice Exams for Enhanced Benefits: An Instrument for Comparing Performance and Mental Effort Measures

Karen J. Knaus,† Kristen L. Murphy, and Thomas A. Holme*‡
Department of Chemistry and Biochemistry, University of Wisconsin–Milwaukee, Milwaukee, WI 53211; *[email protected]

†Current address: Department of Chemistry, University of Colorado–Denver, Denver, CO 80217.
‡Current address: Department of Chemistry, Iowa State University, Ames, IA 50011.

In the Classroom

Students routinely request tools to help them study; this is particularly true for high-stakes assessments such as final exams. Many instructors are happy to supply study tools, including copies of old examinations or other practice exams. As research in cognition provides more clues about how people learn, the possibility of improved study tools has emerged. One productive approach to enhancing the utility of practice exams as a means of helping students understand their study needs is to place such exams within the context of cognitive load theory.

Cognitive Load and Metacognition

Cognitive load may be described as the amount of mental activity imposed on working memory at any instant in time. This concept is arguably descended from the seminal paper by Miller in 1956 (1), which proposed that the human cognitive system—via short-term memory—can actively process 7 ± 2 pieces of information at any time. Baddeley and co-workers (2) have expanded upon this observation in terms of the limitations of working memory. While direct measures of cognitive load have proven to be a challenge, the studies of Gopher and Braune (3) have been reviewed and indicate that (4)

[S]ubjects can subjectively evaluate their cognitive processes, and have no difficulty in assigning numerical values to the imposed mental workload or the invested mental effort in a task.

Although mental effort has been measured using various techniques (5), including rating scales, physiological techniques (e.g., heart-rate variability and pupil dilation response [6]), and dual-task methods, subjective measures have been found to be very reliable, unobtrusive, and very sensitive (7–11). “The intensity of effort expended by students is often considered the essence of cognitive load” (9), and most studies for measuring mental effort have used a subjective rating system (12). Thus, while working on practice exams, students can be asked to introspect on their invested mental effort as an a priori estimate of cognitive load. When students rate their own mental effort in this manner, they are thinking about their own thinking and engaging in metacognition.

Metacognition can be described as awareness of one’s own knowledge and control over cognition (13). Metacognition has been noted to include anything related to the planning, monitoring, and altering use of one’s mental resources in relation to achieving a particular goal (14). Development of classroom tools that help learners develop and master their metacognitive abilities has been described in the science education literature (15, 16).

Combined measures of performance and mental effort can provide valuable information about the intersection between content knowledge and cognitive resource usage. According to the efficiency view (6), learners’ behaviors in a particular learning condition are found to be more efficient if performance is higher than might be expected on the basis of their invested mental effort (3).

Furthermore, by making information about student mental effort available to students within a practice test environment, students receive valuable feedback about the accuracy of their content knowledge and their own cognitive efficiency, which may help them better monitor their own learning. Additionally, aggregation of data from all students in a class can inform an instructor about topics that are going well and others where instructional changes might be helpful.

Instrument Design

The practice exam instrument used in our study contained 50 multiple-choice items, including several specifically designed to assess materials science and nanoscience. After each exam item, a mental effort item was inserted into the exam format, asking students to introspect on the degree of mental effort expended on the question just answered (Figure 1). We used a five-point scale consistent with the number of multiple-choice options available on a typical Scantron answer form. The numbering of the inserted mental effort items took advantage of the adjacent columns on the Scantron answer sheet, so questions were answered in the order 1, 51, 2, 52, 3, 53, and so on (Figure 2).
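As a concrete illustration of this numbering scheme, a minimal sketch is shown below; the helper names and structure are illustrative only and are not part of the study materials. Each content item n is paired with mental effort item n + 50, and the answer sheet is filled in the interleaved order.

# Sketch of the interleaved answer-sheet numbering described above.
# Illustrative only; assumes 50 content items, as in the study.
NUM_ITEMS = 50

def mental_effort_item(content_item):
    """Answer-sheet number of the mental effort item paired with a content item."""
    return content_item + NUM_ITEMS

def answer_order(num_items=NUM_ITEMS):
    """Order in which the answer sheet is filled: 1, 51, 2, 52, 3, 53, ..."""
    order = []
    for n in range(1, num_items + 1):
        order.extend([n, mental_effort_item(n)])
    return order

print(answer_order()[:6])  # [1, 51, 2, 52, 3, 53]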


51. How much mental effort did you expend on question 1?

(A) Very little
(B) Little
(C) Moderate amounts
(D) Large amounts
(E) Very large amounts

Figure 1. Example of a mental effort item inserted into the practice exam format.

Figure 2. Top portion of the Scantron answer sheet used for collecting student answers (item answer column) and mental effort ratings (mental effort rating column).


Exam Administration and Feedback to Students

The practice exam was given approximately one week prior to final examinations at an urban university in the U.S. Midwest. Data described in this section arise from a total of 158 students who took the practice exam (83 students in second-semester general chemistry and 75 students in the single-semester pre-engineering chemistry course) and agreed to participate in the research component of the study by signing the institutional review board (IRB) consent form. Content in the two general chemistry courses was similar, although the single-semester pre-engineering chemistry course was a survey course with engineering applications and did not cover the content in as much depth as two semesters of general chemistry. Students were informed that they would receive their practice exam results within a day. Each student was sent a report via e-mail listing the percentage correct and the mental effort for each of the 12 chemistry content areas (Figure 3). The use of percentages does not imply a prediction of student performance on larger numbers of items; it is merely a compact way to report performance.
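As an illustration of how such a report can be assembled from the answer sheet, a minimal sketch follows; the data structures and helper names are assumptions for illustration, not the authors’ actual spreadsheet procedure.

# Sketch of building a per-student report like Figure 3.
# Assumed inputs (illustrative): answers and effort ratings keyed by content
# item number, an answer key, and a mapping of items to content categories.
from collections import defaultdict

def student_report(answers, key, effort, item_category):
    """Return {category: (percent_correct, mean_mental_effort)}.

    answers:       {item_number: chosen option, e.g. "B"}
    key:           {item_number: correct option}
    effort:        {item_number: rating 1-5 for that content item}
    item_category: {item_number: category name}
    """
    correct = defaultdict(list)
    ratings = defaultdict(list)
    for item, category in item_category.items():
        correct[category].append(answers.get(item) == key[item])
        if item in effort:
            ratings[category].append(effort[item])
    report = {}
    for category in correct:
        pct = 100.0 * sum(correct[category]) / len(correct[category])
        avg = sum(ratings[category]) / len(ratings[category])
        report[category] = (round(pct), round(avg, 1))
    return report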

This format is being developed by the ACS Exams Institute as an online study tool for students. When that format is implemented, the question-numbering scheme will no longer be needed because responses will be entered on a computer screen and collected automatically.

Results and Discussion

Data Analysis for Individual Student Reports

All 50 exam items were grouped into one of 12 different chemistry content categories as depicted in Figure 3. Performance and mental effort data were analyzed using a conventional spreadsheet program to determine an average performance score and an average mental effort score in each of the 12 chemistry content categories for each student who took the practice exam.

The manner in which this information is conveyed uses a variation of Paas and van Merriënboer’s instrument for determining instructional efficiency (4). Our instrument shows a student’s cognitive efficiency in each of the 12 different chemistry content categories (Figure 4) by displaying a scatterplot with the performance index on the vertical axis and the mental effort index on the horizontal axis. In this display, both factors are expressed as Z scores. While this manipulation is not particularly important at the individual student level, it provides normalization for aggregated data, as discussed in the next section.
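For reference, the display can be stated compactly. This is a sketch assuming the standard efficiency formulation of Paas and van Merriënboer (4), of which the plot here is described as a variation; the symbols are ours. With P_j the percent correct and R_j the mean mental effort rating in content category j, and with means and standard deviations taken over the 12 categories,

z_{P,j} = \frac{P_j - \bar{P}}{s_P}, \qquad z_{R,j} = \frac{R_j - \bar{R}}{s_R}, \qquad E_j = \frac{z_{P,j} - z_{R,j}}{\sqrt{2}}

The quadrant plot displays z_{P,j} against z_{R,j} directly; E_j, the signed distance of a point from the line z_P = z_R, is the single-number efficiency of ref 4 and is not reported separately by this instrument.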

It is important to note that the use of statistical methods with this form of data has been well established over the past 30 years in social science research. Studies have found that when a Likert-type scale consists of five or more categories, treating such data with classical statistics is acceptable (17, 18). Essentially, these data are characterized as quasi-interval, which can be treated as lying somewhere between an ordinal and a true interval scale (19). This is a special class of interval scale in which numerical values are assigned to each of the equal-appearing intervals. With respect to our mental effort scale, the difference between a rating of “very little” (numerical value = 1) and “little” (numerical value = 2), or between a rating of “large amounts” (numerical value = 4) and “very large amounts” (numerical value = 5), can be considered equal appearing, at least in terms of human dimensions, thus warranting the treatment of the data using the statistical methods noted. Care is taken in the instrument described here to make no inference that compares specific values on this quasi-interval scale; rather, the mathematical treatment serves only to put the information along an easily understood numerical axis.

In principle, it is possible to alleviate the potential concern that arises from using simple statistical treatment by applying methods such as item response theory (IRT) to rescale ordinal data into interval data (20). Because the primary use of the mental effort data in this instrument is to disperse information along a scale, rather than to compare any given number to another, it is arguable that a simpler mathematical treatment is more appropriate. In particular, the utility to students would be adversely affected if, for example, the mental effort scale required significant explanation.

Figure 3. Example of the practice exam report sent to each student by e-mail. The total represents the average over all categories.

Category | Score (%) | Reported Mental Effort (Av., 1–5)
1. States of matter | 50 | 3.5
2. Stoichiometry | 100 | 3.7
3. Atomic structure and periodicity | 50 | 2.5
4. Molecular structure | 67 | 3.2
5. Acid–base and ionic equilibrium | 25 | 2.0
6. Non-ionic chemical equilibrium | 0 | 3.7
7. Kinetics | 100 | 4.5
8. Thermodynamics | 20 | 3.6
9. Electrochemistry | 20 | 3.0
10. Descriptive chemistry | 75 | 2.0
11. Solutions | 33 | 3.3
12. Materials and nanoscience | 60 | 2.8
Total | 48 | 3.1

It is also important to mention that other methods for determining “cognitive efficiency” exist in the literature. Differing from what we have accomplished, these other studies include, but are not limited to: brain imaging studies (fMRI) in combination with speeded-processing tasks (21); self-reports using a cognitive failure questionnaire after administration of various cognitive tasks (22); and measurement of reaction time to process emotionally related information (23).

In Figure 4 it can be seen that there are only three chemistry categories—descriptive chemistry, special materials, and atomic structure and periodicity—that are found in the upper-left quadrant of the cognitive efficiency graph (high performance correlated with low mental effort) for this example student. These areas could be identified by the student as areas where a relatively high level of knowledge mastery has been obtained and very little additional study may be necessary in preparing for the final exam.

Content areas in the upper-right quadrant of the cognitive efficiency graph—including, in this specific case, molecular structure, kinetics, and stoichiometry—are areas in which high performance is correlated with high mental effort. The results in Figure 4 suggest that additional practice in these content areas might help students develop more efficient strategies to solve problems or answer questions, and thus increase their cognitive efficiency in these areas. For timed tests, providing efficiency information (i.e., performance in combination with mental effort data) acts as a source of enriched feedback for a student, including both a content performance component and a cognitive resource usage component, which may promote the development of more efficient study strategies; notably, this information would not be revealed by a traditional, content-only practice exam.

Chemistry content areas in the lower-left quadrant of the cognitive efficiency graph are categories in which low performance is correlated with low mental effort. For this example, these areas include acid–base and ionic equilibrium and electrochemistry. It may be important to note that these topics were covered late in the semester, so these results may simply indicate that the student did not have enough time to fully learn the material. Another speculation is that these were content areas in which the student recognized a knowledge deficit and therefore felt little motivation to invest large amounts of cognitive resources in study, because high performance seemed improbable. If a student has only one week before the final exam, these areas represent content where significant study time may be needed immediately.

The content categories in which low performance was paired with a high amount of mental effort (lower-right quadrant) included non-ionic chemical equilibrium, thermodynamics, and solutions. This information indicates to the student that despite expending considerable effort on questions in these content categories, he or she did not perform well. Content topics in this quadrant are likely to be ones in which students will need outside interventions to correct the deficiencies, such as help from tutoring centers, the instructor, or the teaching assistant for the course. Students can be coached to identify content areas of weakness from the cognitive efficiency graph, and the results can potentially eliminate the all-too-common office visit in which a student tells the instructor that he or she does not know where to start in seeking help. This analysis is available to the instructor as well and could help direct efforts when an instructor does not know where to start with a student who is deficient in several chemistry content areas.

A key difference between this form of practice test and traditional practice exams, therefore, is that it prompts students to think about their own thinking (i.e., metacognition). An important aspect of metacognition, metacognitive monitoring, is related to a person’s ability to judge his or her own performance (24). Thus, the enriched feedback students receive from the practice exam, which includes both performance and mental effort averages for each content area, helps students gauge their current levels of both content knowledge and knowledge about their own cognition. This type of development holds promise for guiding their future monitoring of cognitive resources.

Figure 4. Graph of cognitive efficiency in different chemistry categories for a single student. Normalized values for student performance in each content category were plotted against normalized values for average mental effort in each category. Values in the upper-left quadrant indicate content areas of high performance and low mental effort, values in the upper-right quadrant indicate areas of high performance and high mental effort, values in the lower-left quadrant indicate areas of low performance and low mental effort, and values in the lower-right quadrant indicate areas of low performance and high mental effort.

Data Analysis for Chemistry Educators

In the same manner as for individual students, additional analyses can be performed to determine an average performance, an average mental effort, and a comparison of these two variables in each of the 12 chemistry content areas for the class as a whole (Figure 5). Again, the performance and mental effort data were analyzed using a conventional spreadsheet program. This type of classroom analysis provides beneficial information for chemistry instructors and chemical education researchers from both a content and a cognitive perspective. The performance data can help instructors identify chemistry subjects for which more intensive instruction using different formats may be necessary. The added mental effort dimension of the analysis helps instructors identify the content areas in which students are experiencing the least and the most mental strain. Being able to identify content areas with the highest cognitive load will help instructors determine where new teaching strategies may be most helpful. Similarly, for chemical education researchers, this type of analysis can help reveal the effectiveness of new instructional methodologies for different chemistry content areas, especially in terms of their cognitive impacts.
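A minimal sketch of how this class-level summary could be computed is shown below; it assumes per-student reports of the form illustrated earlier, and the names are illustrative rather than taken from the authors’ analysis.

# Sketch of aggregating per-student category results for a whole class
# and normalizing them for a quadrant plot like Figures 5 and 6.
# Assumes each student report maps category -> (percent_correct, mean_effort).
from statistics import mean, pstdev

def class_summary(student_reports):
    """Return {category: (performance_z, effort_z)} averaged over students."""
    categories = student_reports[0].keys()
    perf = {c: mean(r[c][0] for r in student_reports) for c in categories}
    eff = {c: mean(r[c][1] for r in student_reports) for c in categories}

    def z(values):
        # Standardize across the 12 content categories.
        m, s = mean(values.values()), pstdev(values.values())
        return {c: (v - m) / s for c, v in values.items()}

    zp, ze = z(perf), z(eff)
    return {c: (zp[c], ze[c]) for c in categories}

def quadrant(performance_z, effort_z):
    """Label the quadrant of the cognitive efficiency plot."""
    vert = "high performance" if performance_z >= 0 else "low performance"
    horiz = "low mental effort" if effort_z < 0 else "high mental effort"
    return f"{vert} / {horiz}"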

As an example, after examining Figure 5, an instructor can learn that although students performed poorly in the categories of electrochemistry, solutions, and thermodynamics, most students invested a substantial amount of mental effort in these areas. As noted earlier, this may be due merely to the relative newness of that material. Although this may not seem like a good use of cognitive resources for a timed exam, the students expended a significant amount of mental effort on these items. This suggests that the students did not feel any less motivated to think and reason in these particular content areas, even though their performance was not particularly strong. In principle, the instructor could hold additional help sessions for these areas in particular, because the practice test revealed both a content deficit and, apparently, a motivation for improvement, as students were prepared to invest cognitive resources in these areas.

Course Comparisons

Cognitive efficiency analyses can also be used to compare results from different chemistry classes. In this case, because the practice test was administered in two different types of general chemistry classes, the possibility exists to investigate how differences in these courses are manifested in student performance and cognition. The differences in cognitive efficiency can best be seen by comparing separate cognitive efficiency graphs for the two types of general chemistry courses (Figures 5–6), in which normalized values for performance and mental effort are used.

A comparison of the cognitive efficiency graphs (Figures 5–6) for the two different types of chemistry courses demonstrates differences in terms of the relationship between performance and utilization of mental resources in different chemistry areas. Both courses (second-semester general chemistry and single-semester pre-engineering chemistry) demonstrated low cognitive efficiency in the solutions category, which is an important observation in general because it is apparently robust regardless of the course emphasis or instructor.

Additionally, differences are observed in the cognitive efficiency of the materials and nanoscience content area. Content related to these areas was not treated in lecture for the full-year course, though the textbook included examples. In the single-semester pre-engineering chemistry course these topics were explicitly, if briefly, introduced despite the “survey” nature of the course. The cognitive efficiency for this content is higher for students in the pre-engineering section than for students in the full-year course. This observation supports the hypothesis that introducing emerging topics such as materials science and nanoscience into general chemistry may prove effective even if the time invested is modest (no topic receives a large allocation of time in the one-semester survey course). The survey nature of this course may also be manifested in the low levels of cognitive efficiency in acid–base and ionic equilibrium, non-ionic chemical equilibrium, and thermodynamics relative to the second-semester general chemistry course.

These observations do not arise from a study with a careful research design, so ultimately they only provide information from which hypotheses might be formed. For example, motivational factors such as the perceived task value and goal orientation of students were not measured prior to exam administration in these courses. Such measurements may prove particularly useful for understanding classroom comparisons of cognitive efficiency, especially if carried out at the start of the semester in combination with administration of a practice exam before the final.

Figure 5. Graph of cognitive efficiency in different chemistry categories for topics taught in a single-semester engineering chemistry course. Normalized values for classroom performance in each chemistry category were plotted against normalized values for average mental effort in each category.

Figure 6. Graph of cognitive efficiency in different chemistry categories for topics taught in a second-semester general chemistry course (average of two sections). Normalized values for classroom performance in each chemistry category were plotted against normalized values for average mental effort in each category.

Intervention Results: Do Students Really Perform Better?

The final component of the data analysis for this research involved investigating whether the practice exam instrument served as an effective intervention for helping students improve their performance in the class. Research into the effectiveness of practice exams for performance improvement has generally shown mixed results, with few statistically significant gains (25–27). These previously noted challenges are also observed with this instrument, perhaps because any observed effects are inevitably convoluted with an indeterminate number of study strategies used by students. Nonetheless, it is possible to analyze the data to see whether positive effects can be found in various student performance measures.

For this portion of the research the student participants were enrolled in a preparatory chemistry course that used an active learning methodology. All data were taken only from students who provided IRB-approved consent to participate in the overall research study. The practice exam was made available on a voluntary basis in both the Fall 2007 semester (88 students took the practice exam and are called participants; 40 did not and are called nonparticipants) and the Spring 2008 semester (n = 60 participants, n = 81 nonparticipants). To consider the effects of participation in the practice exam, a preintervention class-average Z score was calculated for each student (based on the average of the three exams taken prior to the students’ receiving the practice exam), along with a postintervention final exam Z score (based on performance on the final exam, which can be further analyzed in terms of performance on an ACS Exams Institute exam or a locally written exam). These individual Z scores were then used to tabulate a ΔZ score for each student. Because the practice exam was used in two semesters, a number of comparisons are possible. The most apparent is to compare performance gains between students who participated in the practice exam and those who did not. Table 1 shows only those results that are statistically significant (p < 0.05).
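Stated symbolically, this is a restatement of the computation just described; the reference population for each standardization is assumed here to be the class taking the corresponding exams.

Z_i^{\mathrm{pre}} = \frac{\bar{x}_i^{\mathrm{pre}} - \mu_{\mathrm{pre}}}{\sigma_{\mathrm{pre}}}, \qquad Z_i^{\mathrm{final}} = \frac{x_i^{\mathrm{final}} - \mu_{\mathrm{final}}}{\sigma_{\mathrm{final}}}, \qquad \Delta Z_i = Z_i^{\mathrm{final}} - Z_i^{\mathrm{pre}}

Here \bar{x}_i^{\mathrm{pre}} is student i’s average score on the three exams given before the practice exam and x_i^{\mathrm{final}} is that student’s final exam score; the mean ΔZ is then compared between participants and nonparticipants (Table 1).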

During the Spring 2008 semester, the instructor used student response systems (i.e., “clickers”) during in-class active learning exercises in the same format as the practice exam. That is, students first answered a content question and then immediately responded to the mental effort question. In this manner, students were trained to consider their mental effort. Subsequently, when these students participated in the voluntary practice exam they showed greater performance gains on the course final exam than students who did not participate in the practice exam.

Somewhat discouragingly, in Fall 2007 there was a decrease in performance on the ACS Exam portion of the final exam for the students who participated in the practice exam. Further investigation of this observation, by separating students whose performance was above average on the practice exam from those whose performance was below average, was potentially revealing. The students who performed below average on the practice exam had increases in the average ΔZ score, whereas students who performed above average showed decreases in the average ΔZ score. An explanation consistent with this observation is that students who do well on the practice exam believe that they need less study and subsequently perform less well. This observation points to the importance of coaching students on how to use information from practice exams, specifically to guard against complacency when practice performance is strong.

In addition to these two instances where statistically significant differences were observed between groups of students, more mixed results were obtained in other semesters. In particular, the classes from which the data in Figures 4–6 were obtained showed that students who had low performance and low mental effort had relatively large decreases in performance. For these students, it could be speculated that after taking the practice exam some of them may have felt discouraged by the results and, thinking that they had too many deficiencies to overcome in one week, lacked further motivation to prepare for the final. If this was indeed true, instructors should be aware that some students may need additional encouragement not to give up when they do not perform as well as hoped on the practice exam. Thus, instructor coaching on how to use practice exam results may be important for both high-performing and low-performing students.

Another plausible hypothesis is that low-performance students did not benefit from the practice exam intervention because they did not know how to use the results most effectively to develop the best plan of action for studying for the upcoming final. Because this possibility exists, an integral component of the effectiveness of the instrument likely lies in training or coaching students on how to use the results effectively to develop the best individualized study strategies. Finally, it is important to recognize that some students are likely to put limited effort into the course, the practice exam, or the final exam. Performance for these students will be found predominantly in the low mental effort–low performance category, and use of the practice exam instrument would most likely prove to have low efficacy.

Table 1. Distribution of Student Scores Comparing Final Exam Performance to Performance on Previous Hour-Long Exams. Entries are performance gains expressed as ΔZ scores, with group sizes n in parentheses.

Semester | Exam Version | Participants (a) | Nonparticipants (b) | p Value (c)
Spring 2008 | Local final exam | +0.102 (n = 60) | −0.095 (n = 81) | 0.035
Fall 2007 | ACS final exam | −0.136 (n = 88) | +0.241 (n = 40) | 0.023
Fall 2007 (participants only) | ACS final exam | +0.288 (n = 24, low-performance participants) | −0.254 (n = 29, high-performance participants) | 0.015

(a) Students who took the practice exam. (b) Students who did not take the practice exam. (c) Significant at p < 0.05.
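The article does not state which significance test produced these p values; the sketch below shows one plausible analysis, a two-sample t-test on the ΔZ values, with hypothetical variable names. It is an assumption for illustration, not the authors’ documented procedure.

# Hedged sketch: compare the mean performance gain (delta-Z) between students
# who took the practice exam and those who did not. The choice of a
# two-sample t-test is an assumption, not taken from the article.
from scipy import stats

def compare_gains(delta_z_participants, delta_z_nonparticipants, alpha=0.05):
    """Return (mean difference, p value, significant?) for the two groups."""
    t_stat, p_value = stats.ttest_ind(delta_z_participants,
                                      delta_z_nonparticipants)
    mean_diff = (sum(delta_z_participants) / len(delta_z_participants)
                 - sum(delta_z_nonparticipants) / len(delta_z_nonparticipants))
    return mean_diff, p_value, p_value < alpha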


In general, the question of how students use study materials such as practice exams remains an interesting area for research. To date, studies have focused on participation (26, 28), including the reward structure for participation (26), as the key variable. The preliminary results here indicate that student characteristics, such as level of metacognitive development, may be important in determining the performance enhancement associated with participation in practice exams.

Conclusion

The proposed instrument provides several levels of useful information. First and foremost, it provides an enhanced study tool for students who are preparing for exams by showing content areas for additional study and by summarizing student-reported mental effort. This added measure allows the estimation of cognitive efficiency—a combined measure of performance and mental effort—so that students learn something about the intersection of their content knowledge and their knowledge of their own cognition. Students receive enriched feedback from the practice exam and are able to obtain a global view (at that particular point in their study of chemistry) of how their performance in different content areas is related to their cognitive resource usage. This type of formative assessment provides more information than a traditional assessment because it allows students to learn more about their thinking during chemistry exams.

For the instructor of a class, the measure of cognitive efficiency just prior to the final exam represents, in some sense, a measure of the efficacy of the teaching approaches used. If certain content areas show significantly lower cognitive efficiency than others, for example, these may be areas where new instructional approaches might prove useful. Because the cognitive efficiency measure can be derived at the classroom level, it provides several possible windows into instructional efficacy.

Finally, from the perspective of education research, such an instrument may be particularly valuable when used in combination with other instruments that measure, for example, student motivation levels or the development of metacognitive abilities. If assessment instruments can serve multiple uses—as a tool to help improve metacognitive awareness (29), to help identify content areas where different pedagogical methods may be needed, and as a means of measuring the effectiveness of new teaching methodologies for different content from a cognitive science perspective—then improvements in how we measure learning can develop alongside improvements in both how we teach and how our students learn chemistry.

Acknowledgments

This work is based upon work supported by the National Science Foundation (NSF) under grant numbers CHE-0407378 and DUE-0618600. Any opinions, findings, and conclusions communicated in this material are those of the authors and do not necessarily reflect the views of the NSF. The authors would also like to thank the instructors who allowed data collection in their classrooms, which contributed to our research findings.

Literature Cited

1. Miller, G. A. Psychol. Rev. 1956, 101, 343–352.
2. Baddeley, A. D.; Della Sala, S. Phil. Trans. Roy. Soc. 1996, 351, 1397–1404.
3. Gopher, D.; Braune, R. Human Factors 1984, 26 (5), 519–532.
4. Paas, F. G. W. C.; Van Merriënboer, J. J. G. Human Factors 1993, 35 (4), 737–743.
5. Eggemeier, F. T. Properties of Workload Assessment Techniques. In Human Mental Workload; Hancock, P. A., Meshkati, N., Eds.; Elsevier: Amsterdam, 1988; pp 41–62.
6. Ahern, S. K.; Beatty, J. Science 1979, 205, 1289–1292.
7. Paas, F. G. W. C. J. Educ. Psychol. 1992, 84, 429–434.
8. Paas, F. G. W. C.; Van Merriënboer, J. J. G. J. Educ. Psychol. 1994, 86, 122–123.
9. Paas, F. G. W. C.; Van Merriënboer, J. J. G.; Adam, J. J. Percept. Mot. Skills 1994, 79, 419–430.
10. Ayres, P. Learn. Instr. 2006, 16, 389–400.
11. Hamilton, P. Process Entropy and Cognitive Control: Mental Load in Internalized Thought Process. In Mental Workload: Its Theory and Measurement; Moray, N., Ed.; Plenum Press: New York, 1979; pp 289–298.
12. Paas, F. G. W. C.; Renkl, A.; Sweller, J. Educ. Psychol. 2003, 38, 1–4.
13. Flavell, J. Am. Psychol. 1979, 34, 906–911.
14. Brown, A. L.; Bransford, J. D.; Ferrara, R. A.; Campione, J. C. Learning, Remembering, and Understanding. In Handbook of Child Psychology, 4th ed.; Flavell, J. H., Markman, E. M., Eds.; Wiley: New York, 1983; pp 77–166.
15. Rickey, D.; Stacy, A. M. J. Chem. Educ. 2000, 77, 915–920.
16. Mazur, E. Peer Instruction: Getting Students To Think in Class. In The Changing Role of Physics Departments in Modern Universities; Redish, E. F., Rigden, J. S., Eds.; American Institute of Physics: Woodbury, NY, 1997; pp 981–988.
17. Johnson, D. R.; Creech, J. C. Am. Sociol. Rev. 1983, 48, 398–407.
18. Zumbo, B. D.; Zimmerman, D. W. Can. Psychol. 1993, 34, 390–400.
19. Kachigan, S. Multivariate Statistical Analysis: A Conceptual Introduction, 2nd ed.; Radius Press: New York, 1991.
20. Harwell, M. R.; Gatti, G. G. Rev. Educ. Research 2001, 71, 105–131.
21. Rypma, B.; Berger, J. S.; Prabhakaran, V.; Bly, B. M.; Kimberg, D. Y.; Biswal, B. B.; D’Esposito, M. Neuroimage 2006, 33, 969–979.
22. Hallam, R. S.; McKenna, L.; Shurlock, L. Int. J. Audiol. 2004, 43, 218–226.
23. McGivern, R. F.; Andersen, J.; Byrd, D.; Mutter, K. L.; Reilly, J. Brain Cogn. 2002, 50, 73–89.
24. Maki, R. H.; Shields, M.; Wheeler, A. E.; Zacchilli, T. L. J. Educ. Psychol. 2005, 97, 723–731.
25. Maki, R. H.; Serra, M. J. Educ. Psychol. 1992, 84, 200–210.
26. Oliver, R.; William, R. L. J. Behavioral Educ. 2005, 14, 141–152.
27. Crisp, V.; Sweiry, E.; Ahmed, A.; Pollitt, A. Educ. Res. 2008, 50 (1), 95–115.
28. Gretes, J. A.; Green, M. J. Res. Computing Educ. 2000, 33, 46–54.
29. Schraw, G. Instr. Sci. 1998, 26, 113–125.

Supporting JCE Online Material

http://www.jce.divched.org/Journal/Issues/2009/Jul/abs827.html

Abstract and keywords
Full text (PDF)
Links to cited URLs and JCE articles