
JOURNAL OF RESEARCH IN SCIENCE TEACHING VOL. 44, NO. 1, PP. 57–84 (2007)

Exploring Teachers’ Informal Formative Assessment Practices and Students’ Understanding in the Context of Scientific Inquiry

Maria Araceli Ruiz-Primo, Erin Marie Furtak

Stanford Education Assessment Laboratory, School of Education, Stanford University, 485

Lasuen Mall, Stanford, California 94305-3096

Received 3 June 2005; Accepted 21 June 2006

Abstract: This study explores teachers’ informal formative assessment practices in three middle

school science classrooms. We present a model for examining these practices based on three components of

formative assessment (eliciting, recognizing, and using information) and the three domains linked to

scientific inquiry (epistemic frameworks, conceptual structures, and social processes). We describe the

informal assessment practices as ESRU cycles—the teacher Elicits a question; the Student responds; the

teacher Recognizes the student’s response; and then Uses the information collected to support student

learning. By tracking the strategies teachers used in terms of ESRU cycles, we were able to capture

differences in assessment practices across the three teachers during the implementation of four

investigations of a physical science unit on buoyancy. Furthermore, based on information collected in a

three-question embedded assessment administered to assess students’ learning, we linked students’ level of

performance to the teachers’ informal assessment practices. We found that the teacher who more frequently

used complete ESRU cycles had students with higher performance on the embedded assessment as

compared with the other two teachers. We conclude that the ESRU model is a useful way of capturing

differences in teachers’ informal assessment practices. Furthermore, the study suggests that effective

informal formative assessment practices may be associated with student learning in scientific inquiry

classrooms. © 2006 Wiley Periodicals, Inc. J Res Sci Teach 44: 57–84, 2007

At a time in which accountability testing is in vogue, teachers need to be supported with formative assessment strategies that can help them foster deep student understanding.

Formative assessment is defined as assessment for learning and not assessment of learning

(Black, 1993). Formative assessment has the potential to directly improve learning because it

takes place while instruction is in progress and can serve as a basis for providing timely feedback

to increase student learning (Sadler, 1989; Shepard, 2003). Furthermore, strategies that prove to

increase student learning should also increase the likelihood that student learning will be reflected

Contract grant sponsor: Educational Research and Development Centers Program (Office of Educational Research

and Improvement, U.S. Department of Education); Contract grant number: R305B60002.

Correspondence to: M. Araceli Ruiz-Primo; E-mail: [email protected]

DOI 10.1002/tea.20163

Published online 8 December 2006 in Wiley InterScience (www.interscience.wiley.com).

© 2006 Wiley Periodicals, Inc.


in well-designed large-scale assessments. However, educational researchers are only beginning to

identify effective formative assessment practices and to directly link these practices to measures of

student learning (Black & Wiliam, 1998).

In this study, we focus on the informal formative assessment practices teachers implement in

their classrooms to gather information about students’ developing understanding during everyday

whole-class conversations. We developed and empirically tested a model of informal formative

assessment based on a pattern that involves teachers eliciting information from students, listening

to the students’ responses, recognizing the information in some way, and acting or using this

information to help students improve their learning. Two main research questions guided the

study: Can the model of informal formative assessment distinguish the quality of informal

assessment practices across teachers? Can the quality of teachers’ informal formative assessment

practices be linked to student performance?

The study focuses on three middle school teachers implementing the first four investigations

of an inquiry-based physical science curriculum. We hypothesized that teachers whose informal

assessment practices were more consistent with the model would have students with higher

performance on three assessment tasks implemented after the fourth investigation.

In what follows we first describe the model of informal assessment in the context of formative

assessment in general, and classroom talk and scientific inquiry in particular. Second, we describe

the methodology we employed to collect and analyze the classroom conversations, and then we

present our findings.

Background

Black (1993) emphasized that formative assessment is essential to effective teaching and

learning. Formative assessment involves gathering, interpreting, and acting on information about

students’ learning so that it may be improved (Bell & Cowie, 2001). That is, information gained

through formative assessment should be used to modify teaching and learning activities to reduce

the gap between desired student performance and observed student performance (Bell & Cowie,

2001; Black & Wiliam, 1998; National Research Council, 1999; Shavelson, Black, Wiliam, &

Coffey, 2003). Student involvement is also essential, as students need to recognize, evaluate, and

react to their own learning and/or others’ assessments of their learning (Bell & Cowie, 2001;

National Research Council, 1999; Zellenmayer, 1989).1

From Formal to Informal Formative Assessment

Classroom formative assessment can be seen as a continuum determined by the premeditation

of the assessment moment, the formality of means used to make explicit what students know and

can do, and the nature of the action taken by the teacher (the characteristics of the feedback). The

continuum then runs from formal formative assessment at one end to informal formative assessment at the other (Bell & Cowie, 2001; Shavelson et al., 2003). Each extreme can be

characterized in a different manner. Formal formative assessment usually starts with students

doing/carrying out an activity designed or selected in advance by the teacher so that information

may be more precisely collected (gathering). Typically, formal formative assessments take the

form of curriculum-embedded assessments that focus on some specific aspect of learning (e.g.,

students’ knowledge about why objects sink or float), but they can also be direct questioning,

quizzes, brainstorming, generation of questions, and the like (Bell & Cowie, 2001). The activity

enables teachers to step back at key points during instruction, check student understanding

(interpreting), and plan the next steps they must take to advance their students’


learning (acting). Teachers plan the implementation of this type of formative assessment at the

beginning, during, or at the end of a unit.

Conversely, informal formative assessment is more improvisational and can take place in any

student–teacher interaction at the whole-class, small-group, or one-on-one level. It can arise out of

any instructional/learning activity at hand, and it is ‘‘embedded and strongly linked to learning

and teaching activities’’ (Bell & Cowie, 2001, p. 86). The information gathered during informal

formative assessment is transient (Bell & Cowie, 2001; e.g., students’ comments, responses, and

students’ questions) and many times goes unrecorded. It can also be nonverbal (based on teacher’s

observations of students during the course of an activity). The time-frame for interpreting and

acting is more immediate when compared with formal formative assessments. A student’s

incorrect response or unexpected question can trigger an assessment event by making a teacher

aware of a student’s misunderstanding. Acting in response to the evidence found is usually quick,

spontaneous, and flexible, because it can take different forms (e.g., responding with a question,

eliciting other points of view from other students, conducting a demonstration when appropriate,

repeating an activity).

Our study focuses on the latter type of formative assessment, the ongoing and informal

assessment that can take place in any student–teacher interaction and that helps teachers acquire

information on a continuing basis. In this context, we use eliciting, recognizing, and using to

describe the teacher’s activities rather than gathering, interpreting, and acting, to more accurately

reflect the model of informal formative assessment. In formal formative assessment, teachers

gather (i.e., collect or bring together) information from all the students in the groups at a planned

time; however, the word eliciting means evoking, educing, bringing out, or developing. To

describe a teacher’s actions as eliciting during informal formative assessment is thus a more

accurate description, as teachers are calling for a reaction, clarification, elaboration, or

explanation from students. During formal formative assessment, teachers have the time to step

back to analyze and interpret the information collected/gathered. Based on this interpretation an

action can be planned (e.g., reteaching a concept). However, during informal formative

assessment, teachers must react on the fly by recognizing whether a student’s response is a

scientifically accepted idea and then use the information from the response in a way that the

general flow of the classroom narrative is not interrupted (e.g., calling on students in the class to start a

discussion, shaping students’ ideas). Therefore, we believe that informal formative assessment is better conceptualized as a cycle of eliciting, recognizing,

and using. Explanations and examples of formal and informal formative assessment practices are

provided in Table 1. In what follows we contextualize informal assessment in the everyday

classroom talk in a scientific inquiry learning environment.

Classroom Talk

Classroom talk has been established as a legitimate object of study (Edwards & Westgate,

1994). Early research in the process–product tradition focused on the conceptual level of teacher

questions (Redfield & Rousseau, 1981) and other teacher ‘‘behaviors’’ that might be correlated

with measures of student learning (Brophy & Good, 1986). Contemporary studies have placed

teacher–student discourse in context by examining authority structures, the responsiveness of the

teacher-to-student contributions, and patterns in classroom talk (Cazden, 2001; Edwards &

Mercer, 1987; Lemke, 1990; Scott, 1998).

One of these patterns of student–teacher interaction that has become the subject of extensive

discussion has been alternately described as IRE (Initiation, Response, Evaluation) or IRF

(Initiation, Response, Feedback). In this sequence, the teacher initiates a query, a student


responds, and the teacher provides a form of evaluation or generic feedback to the student

(Cazden, 2001). The IRE and IRF sequences are characterized by the teacher often asking

‘‘inauthentic questions’’ (Nystrand & Gamoran, 1991) in which the answer is already known by

the teacher, sometimes for the sake of making the classroom conversation appear more like a

dialogue than a monologue. Teaching practices that constrain students within IRE/F patterns

have been criticized because they involve students in ‘‘procedural’’ rather than ‘‘authentic’’

engagement (Nystrand & Gamoran, 1991).

Informal Formative Assessment and Classroom Talk: Assessment Conversations

Ongoing formative assessment occurs in a classroom learning environment that helps

teachers acquire information on a continuing and informal basis, such as within the course of daily

classroom talk. This type of classroom talk has been termed an assessment conversation (Duschl,

2003; Duschl & Gitomer, 1997), or an instructional dialogue that embeds assessment into an

activity already occurring in the classroom. Assessment conversations permit teachers to

recognize students’ conceptions, mental models, strategies, language use, or communication

skills, and allow them to use this information to guide instruction. In classroom learning

environments in which assessment conversations take place, the boundaries of curriculum,

instruction, and assessment should blur (Duschl & Gitomer, 1997). For example, an instructional

activity suggested by a curriculum, such as discussion of the results of an investigation, can be

viewed by the teachers as an assessment conversation to find out about how students evaluate the

quality of evidence and how they use evidence in explanations.

Assessment conversations have the three characteristics of informal assessment previously

described: eliciting, recognizing, and using information. Eliciting information employs strategies

that allow students to share and make visible or explicit their understanding as completely as

possible (e.g., sharing their thinking with the class, in overheads or posters).

Table 1
Differences between formal and informal formative assessment practices

Formal: Designed to provide evidence about students’ learning

Gathering: Teacher collects or brings together information from students at a planned time. For example, quizzes, embedded assessments.
Interpreting: Teacher takes time to analyze information collected from students. For example, reading student work from all the students, providing written comments to all students.
Acting: Teacher plans an action to help students achieve learning goals. For example, writing or changing lesson plans to address the state of student learning.

Informal: Evidence of learning generated during daily activities

Eliciting: Teacher brings out or develops information in the form of a verbal response from students. For example, asking students to formulate explanations or to provide evidence.
Recognizing: Teacher reacts on the fly by recognizing students’ responses and comparing them to accepted scientific ideas. For example, repeating or revoicing students’ responses.
Using: Teacher immediately makes use of the information from the students during the course of the ongoing classroom narrative. For example, asking students to elaborate on their responses, explaining learning goals, promoting argumentation.

Recognizing students’


thinking requires the teacher to judge differences between students’ responses, explanations, or

mental models so the critical dimensions relevant for their learning can be made explicit (e.g.,

teacher compares students’ responses according to the evidence provided or responds to students

by asking which explanation is more scientifically acceptable based on the information provided).

Using information from assessment conversations involves taking action on the basis of student

responses to help students move toward learning goals (e.g., helping students to reach consensus

on a particular explanation based on scientific evidence). The range of student conceptions at

different points of a unit should determine the nature of the conversation; therefore, more than one

iteration of the cycle of eliciting, recognizing, and using may be needed to reach a consensus that

reflects the most complete and appropriate understanding.

Assessment conversations can thus be described in the context of classroom talk moves

(Christie, 2002) as ESRU cycles—the teacher Elicits a question; the Student responds; the teacher

Recognizes the student’s response; and then Uses the information collected to support student

learning.2 This model of informal formative assessment is depicted in Figure 1.

Figure 1. The ESRU model of informal formative assessment.

We distinguish the ESRU model from the IRE/F sequence in three ways: First, the eliciting

questions used to initiate a sequence have the potential for providing information on the evolving

status of student conceptions and understanding about scientific inquiry skills and habits of mind.

For example, teachers can ask questions about understanding key points (e.g., clarification

questions about vague statements or lifting questions to improve the level of discussion). Second,

recognizing students’ responses in diverse ways (e.g., revoicing, rephrasing) is considered

fundamental because it indicates to the student that his/her contribution has been heard and

accepted into the ongoing classroom narrative (Scott, 1998). Recognizing students’ responses also

provides the opportunity for the teacher to pull out the essential aspects of student participation

and to act on them, and also provides the student the opportunity to evaluate the correctness of the

teacher’s interpretation of their contribution.

Third, our conception of using comes from the formative assessment and the scientific inquiry

literature (Black & Wiliam, 1998; Clarke, 2000; Duschl, 2000, 2003; Duschl & Gitomer, 1997;

National Research Council, 2001; Ramaprasad, 1983; Sadler, 1989, 1998). Use, in our sequence,

is more specific than the meaning assigned to feedback in the IRF sequence. Within ‘‘use,’’ a

teacher can provide students with specific information on actions they may take to reach learning

goals: ask another question that challenges or redirects the students’ thinking; model

communication; promote the exploration and contrast of students’ ideas; make connections

between new ideas and familiar ones; recognize a student’s contribution with respect to the topic

under discussion; or increase the difficulty of the task at hand. Clearly, evaluations by themselves

(e.g., ‘‘good’’ or ‘‘excellent’’) cannot be part of our ESRU pattern unless they are embedded in



some elaboration on the rationale behind the evaluation provided; that is, helpful feedback.

Defining feedback as a kind of use of formative assessment separates it from more general

interpretations normally included in the IRF sequence.3

Assessment conversations, then, require teachers to be facilitators and mediators of student

learning, rather than providers or evaluators of correct or acceptable answers. In summary,

successful classrooms emphasize not only the management of actions, materials, and behavior, but also the management of reasoning, ideas, and communication (Duschl & Gitomer, 1997).

Assessment Conversations in the Context of Scientific Inquiry

A goal of the present science education reforms is the development of thinking, reasoning,

and problem-solving skills to prepare students for the development and evaluation of scientific

knowledge claims, explanations, models, scientific questions, and experimental designs (Duschl

& Gitomer, 1997). The National Research Council (2001) describes these habits of mind as

fundamental abilities for students as they engage in the learning of science through the process of

inquiry. If students are taught science in the context of inquiry, they will know what they know, how

they know what they know, and why they believe what they know (Duschl, 2003).

Frequent and ongoing assessment activities in the classroom can help achieve these habits of

mind by providing information on the students’ progress toward the goal and helping teachers

incorporate this information to guide their instruction (Black & Wiliam, 1998; Duschl & Gitomer,

1997; Duschl, 2003). Assessment conversations in the context of scientific inquiry should focus on

three integrated domains (Driver, Leach, Millar, & Scott, 1996; Duschl, 2000, 2003): epistemic

frameworks for developing and evaluating scientific reasoning; conceptual structures used when

reasoning scientifically; and social processes that focus on how knowledge is communicated,

represented, and argued. Epistemic structures are the knowledge frameworks that involve the rules

and criteria used to develop and/or judge what counts as scientific (e.g., experiments, hypotheses,

or explanations). Epistemic frameworks emphasize not only the abilities involved in the processes

of science (e.g., observing, hypothesizing, and experimenting, using evidence, logic, and

knowledge to construct explanations), but also the development of the criteria to make judgments

about the products of inquiry (e.g., explanations or any other scientific information). When

students are asked to develop or evaluate an experiment or an explanation, we expect them to use

epistemic frameworks. Conceptual structures involve deep understanding of concepts and

principles as parts of larger scientific conceptual schemes. Scientific inquiry requires knowledge

integration of those concepts and principles that allows students to use that knowledge in an

effective manner in appropriate situations. Social processes refer to the frameworks involved in

students’ scientific communications needed while engaging in scientific inquiry, and can be oral,

written, or pictorial. These processes involve the syntactic, pragmatic, and semantic structures of scientific

knowledge claims, their accurate presentation and representation, and the use of diverse forms of

discourse and argumentation (Duschl & Grandy, 2005; Lemke, 1990; Marks & Mousley, 1990;

Martin, 1989, 1993; Schwab, 1962).

A key characteristic of effective informal formative assessment is to promote frequent

assessment conversations that may allow teachers to listen to inquiry (Duschl, 2003). Listening to

inquiry should focus on helping students ‘‘examine how scientists have come to know what they

believe to be scientific knowledge and why they believe this knowledge over other competing

knowledge claims’’ (Duschl, 2003, p. 53). Therefore, informal formative assessment that

facilitates listening to inquiry should: (1) involve discussions in which students share their

thinking, beliefs, ideas, and products (eliciting); (2) allow teachers to acknowledge student

participation (recognizing); and (3) allow teachers to use students’ participation as the springboard

to develop questions and/or activities that can promote their learning (using).


Looking Closely at Assessment Conversations: The ESRU Cycle

We developed an approach that could help us recognize informal formative assessment

practices by establishing operational criteria for three sources of information: the informal

formative assessment cycle (ESRU); the inquiry dimensions previously described; and what we

and others consider to be relevant aspects of these dimensions (Duschl, 2003; Duschl & Gitomer,

1997; National Research Council, 1999, 2001).

Our approach, then, focuses on the nature of the teacher interventions (Table 2). Teacher

strategies are organized according to the ESRU cycle (eliciting, student response, recognizing, and using information) across two of the three domains, epistemic frameworks and conceptual structures.4

Table 2
Strategies for ESRU cycles by dimension

Epistemic frameworks

Eliciting (teacher asks students to):
- Compare/contrast observations, data, or procedures
- Use and apply known procedures
- Make predictions/provide hypotheses
- Interpret information, data, patterns
- Provide evidence and examples
- Relate evidence and explanations
- Formulate scientific explanations
- Evaluate quality of evidence
- Suggest hypothetical procedures or experimental plans
- Compare/contrast others’ ideas
- Check students’ comprehension

Recognizing (teacher):
- Clarifies/elaborates based on students’ responses
- Takes votes to acknowledge different students’ ideas
- Repeats/paraphrases students’ words
- Revoices students’ words (incorporates students’ contributions into the class conversation, summarizes what the student said, acknowledges the student’s contribution)
- Captures/displays students’ responses/explanations

Using (teacher):
- Promotes students’ thinking by asking them to elaborate their responses (why, how)
- Compares/contrasts students’ responses to acknowledge and discuss alternative explanations/conceptions
- Promotes debating and discussion among students’ ideas/conceptions
- Helps students to achieve consensus
- Helps relate evidence to explanations
- Provides descriptive or helpful feedback
- Promotes making sense
- Promotes exploration of students’ own ideas
- Refers explicitly to the nature of science
- Makes connections to previous learning

Conceptual structures

Eliciting (teacher asks students to):
- Provide potential or actual definitions
- Apply, relate, compare, contrast concepts
- Compare/contrast others’ definitions or ideas
- Check their comprehension


We have not specified teacher interventions in the social domain because we believe

this domain is by nature embedded in assessment conversations. Duschl (2003) described the

social domain as ‘‘processes and forums that shape how knowledge is communicated, represented,

argued and debated’’ (p. 42). Thus, when teachers elicit student conceptions with a question of a

conceptual or epistemic nature, they are simultaneously engaging students in the social process of

scientific inquiry; that is, inviting students to communicate, represent, argue, and debate their

ideas about science. In this manner, questions categorized as either conceptual or epistemic can

also be demonstrated to have a social function. For example, a teacher may ask a student to

compare and contrast the concepts of weight and mass. While doing so serves the purpose of

helping students to differentiate between the two concepts, it may also inform students about the

appropriate scientific language for communicating ideas in class. In the epistemic domain, asking

students to provide evidence for their ideas serves the dual purposes of challenging students to

consider how they know what they know, as well as to establish standards for knowledge claims

communicated in science.

Rather than including students’ responses in Table 2, we chose to examine the teachers’

responses by referring to students’ responses and then using that information to determine the kind

of teacher response that had been delivered (e.g., repeats students’ words, revoices students’

words). It is important to note that the dimensions of scientific inquiry are used only to distinguish

the strategies used in the eliciting phase. The rationale is that the eliciting strategies (questions)

can be linked more clearly to these dimensions, whereas recognizing and using strategies (actions

taken by the teacher) can be used as a reaction to any type of initial question and student response.

The Appendix provides a complete list of the strategies used to code teachers’ questions and

actions.

The Study

The data analyzed in the present study were collected as part of a larger project conducted

during the 2003–2004 school year (Shavelson & Young, 2000). This project was a collaboration

between the Stanford Educational Assessment Laboratory (SEAL) and the Curriculum Research

and Development Group (CRDG) at the University of Hawaii. The project focused on the first

physical science unit of the Foundational Approaches to Science Teaching (FAST I; Pottenger &

Young, 1992) curriculum, titled ‘‘Properties of Matter.’’ The FAST curriculum is a constructivist,

inquiry-based, middle-school science education program developed by the CRDG and aligned

with the National Science Education Standards (Curriculum Research and Development Group,

2003; Rogg & Kahle, 1997).

In ‘‘Properties of Matter,’’ students investigate the concepts of mass, volume, and density,

culminating in a universal, density-based explanation of sinking and floating. Most of the

investigations in the unit follow a basic pattern of presenting a problem for students to investigate,

collecting data, preparing a graph, and responding to summary questions in a class discussion. The

curriculum places an emphasis on students working in small groups and participating in class

discussion to identify and clarify student understanding; thus, the FAST curriculum provides

multiple opportunities for assessment conversations to take place. The present analysis focuses on

the implementation of the first four investigations in this unit, Physical Investigation 1 (PS1) to

Physical Investigation 4 (PS4).

It is important to note that analysis of the teachers’ informal formative assessment practices

took place over time. That is, the information provided herein is not based on a single lesson but on

all the lessons that took place during the implementation of the four FAST investigations. By

analyzing each of these teachers over a period of time, we sought to characterize their informal


assessment practices in a manner not possible in studies that analyze conversation data from one

teacher (Kelly & Crawford, 1997; Roth, 1996), across different grade levels (van Zee, Iwasyk,

Kurose, Simpson, & Wild, 2001), or in excerpts of short duration (Resnick, Salmon, Zeitz,

Wathen, & Holowchask, 1993). In the next section, we describe in detail the instruments and

means used to collect the information.

Method

Participants

We selected three teachers and their students from the original 12 involved in the SEAL/

CRDG project on the basis of maximizing contrasts between informal assessment practices. These

teachers were also selected because, at the time of the original analysis, videotape data and measures of student performance on the pretest were available. There were no significant

differences between the students’ performance in the teachers’ classrooms on the pretest.

The teachers participating in this study were asked to videotape their classrooms in every

session they taught beginning with their first FAST lesson. Each teacher was provided with a

digital video camera, a microphone, and videotapes. We provided each teacher with suggestions

on camera placement and trained them on how to use the camera. Teachers were asked to submit

the tapes to SEAL every week in stamped envelopes. The teachers’ characteristics are briefly

described in Table 3.

Rob. Rob teaches at a middle school in a rural community near a large city in the Midwest.

Students at the school are mostly white, and the highest percentage minority population in the

school is Native American (11%). Both Rob and a school administrator characterized the school as

being a ‘‘joiner,’’ taking opportunities to participate in nationwide assessments such as the

National Assessment of Educational Progress in the hope of receiving financial compensation.

Rob majored in science and recently completed a master’s degree in geosciences. He has taught

science for 14 years, working for 3 years at the sixth and seventh grade level.

Table 3
Characteristics of teachers

                                     Rob                              Danielle                     Alex
Gender                               Male                             Female                       Male
Ethnicity                            White (not Hispanic origin)      White (not Hispanic origin)  White (not Hispanic origin)
Highest degree earned                BA                               MA                           MA
Major in science                     Yes                              No                           Yes
Minor in science                     Yes                              No                           Yes
Teacher credential                   State in science, diverse areas  PreK-6                       Residency certification, K-8 science and English
Years of teaching                    14                               3                            2
Years of teaching science            14                               1                            2
Years teaching sixth/seventh grade   3                                1                            2
Grade level taught                   7                                6                            7
Science session length               55 minutes                       40 minutes                   55 minutes
Number of students taught            25                               25                           26

Although he has been


trained in multiple levels of FAST, the SEAL/CRDG project represented his first time teaching

FAST I. During the investigations, pairs of students would turn around to work with the pair of

students seated behind them as a group of four.

Danielle. Danielle teaches at an elementary school in the Northeast. While she is credentialed

to teach pre-kindergarten through sixth grade, she is now exclusively teaching sixth grade

mathematics, science, and reading as a result of being bumped from her position as a kindergarten

teacher during the year prior to the study. She majored in English as an undergraduate, and holds a

master’s degree in elementary education and reading. Her participation in the SEAL/CRDG

project came during her second year of teaching FAST. Danielle’s school is located in a

predominantly white, lower middle-class suburb in an expansive building that also houses the

district offices. Students sit at desks clustered into tables of four to six, and conduct their

investigations at these same tables. Danielle described her school district as being very supportive

of the FAST program, as well as providing teachers with opportunities to learn more about

students’ understanding. For example, Danielle took the eighth grade science test along with other

teachers from her district, discussing students’ responses in-depth along with possible ways to

identify the areas in which students need more assistance.

Alex. Alex was in his third year of teaching during the study, working at a middle school in the

Pacific Northwest. The student population is mostly white and middle class, with a relatively high proportion of Native American students from a nearby reservation. He majored in

natural resource management and holds a master’s degree in teaching. Alex described his school’s

administration as being supportive of his professional development, paying for him and other

teachers in the department to attend National Science Teachers’ Association conferences. During

the year of the SEAL/CRDG study, Alex taught six classes each day, five seventh grade FAST

classes and one elective class, which he added voluntarily to his course load, called ‘‘Exploring

Science.’’ His classroom is larger than Danielle’s or Rob’s, with space for desks aligned in rows in

the center of the classroom and laboratory stations, each with a large table and sink, around the

periphery of the classroom.

Coding the Videotape Data

All the videotapes for every session taught from PS1 to PS4 were transcribed, amounting

to transcripts from 30 lessons across the three teachers.5 The transcripts identified whether

statements were made by the teacher or by students. Each speaking turn made by the teacher was

numbered and segmented into verbal units (VUs). VUs were determined by the content of the

teacher’s statement.

Next, we identified the assessment conversations that occurred within each transcript. We

focused only on assessment conversations that involved teacher–whole-class interactions.

Assessment conversations were identified based on the following criteria (Duschl & Gitomer,

1997): the conversation concerned a concept from or aspect of one of the FAST investigations

(e.g., classroom procedures such as checking out books or upcoming school assemblies were not

considered); the teacher was not the only speaker during the episode (i.e., students also had turns

speaking); and student responses were elicited by the teacher through questions. Transcripts that

did not include assessment conversations were not coded further. Twenty-six assessment

conversations were identified across the instructional episodes and teachers. Of these

conversations, most (46%) were observed in Danielle’s classroom, followed by Alex’s (31%)

and Rob’s (23%).

We then coded the transcripts of the assessment conversations according to the ESRU model.

All coding of transcripts was performed while watching the videotapes. First, all the assessment


conversations identified across transcripts were coded at the speaking-turn level by two coders.

This coding process consisted of identifying the exact strategy being used by the teacher in a

particular speaking turn and/or verbal unit (e.g., teacher asks students to provide predictions,

teacher repeats student response, etc.).

Because each strategy we coded was mapped onto the ESRU model (see Table 2), we were

then able to identify instances in which the teacher–student conversation followed the ESRU

model. In this phase of the coding process, we identified complete (ESRU) and incomplete cycles

(ES and ESR) within the two types of inquiry dimensions, ‘‘Epistemic’’ and ‘‘Conceptual.’’ We

also coded ESRU cycles that involved ‘‘Non-Inquiry’’ types of codes (e.g., asking the students

simple questions). Therefore, nine types of ESRU cycles could be identified in the transcripts,

three of each inquiry dimension and three for the ‘‘Non-Inquiry’’ category.

In some cases, parts of the ESRU cycle were implicit in some of the teachers’ comments. For

example, a teacher’s question, ‘‘How do you know that?’’ implicitly recognizes the student’s

response, and uses that response to elicit more information from the student. Thus, we coded

statements such as this as implicit cases of recognizing and eliciting. In other cases, teachers would

elicit a particular question from the class, and then recognize individual students by name without

repeating the question. In these cases, we coded the speaking turn as an implicit case of eliciting

the previously asked question. Finally, coders noted when the same student was involved in the

teacher–student interactions by using arrows to connect the cycles.
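To make the cycle-identification step concrete, the following sketch shows one way the scan could be expressed in code once a transcript has been coded at the speaking-turn level. It is a minimal illustration only, not the procedure the coders followed (coding was done by hand against the videotapes); the Turn record and the single-letter move labels are hypothetical simplifications of the coding scheme.

```python
# Minimal sketch: classify coded speaking turns into complete (ESRU) and
# incomplete (ES, ESR) cycles. A hypothetical simplification of the paper's
# hand-coding procedure; implicit moves and same-student iterations are ignored.
from dataclasses import dataclass

@dataclass
class Turn:
    move: str       # "E" (elicit), "S" (student response), "R" (recognize), "U" (use)
    dimension: str  # "Epistemic", "Conceptual", or "Non-Inquiry"

def find_cycles(turns):
    """Scan for E-initiated sequences and record how far each one progressed."""
    cycles, i = [], 0
    while i < len(turns):
        if turns[i].move != "E":
            i += 1
            continue
        dim = turns[i].dimension  # the eliciting question fixes the dimension
        seq, j = "E", i + 1
        for expected in ("S", "R", "U"):
            if j < len(turns) and turns[j].move == expected:
                seq += expected
                j += 1
            else:
                break
        if seq in ("ES", "ESR", "ESRU"):
            cycles.append((dim, seq))
        i = max(j, i + 1)
    return cycles

turns = [Turn("E", "Epistemic"), Turn("S", "Epistemic"), Turn("R", "Epistemic"),
         Turn("E", "Conceptual"), Turn("S", "Conceptual")]
print(find_cycles(turns))  # [('Epistemic', 'ESR'), ('Conceptual', 'ES')]
```

Tallying the (dimension, sequence) pairs that such a scan produces is what yields frequency counts of the kind reported in Table 4.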

Intercoder reliability. The identification of the ESRU cycles was done independently. Four

of the transcripts were used for training purposes and five were used to assess the consistency of

coders in identifying the cycles. Because the main interest was in determining the consistency of

coders in identifying types of ESRUs, we calculated the intercoder reliability by type. The

averaged intercoder reliability coefficient across the nine types of ESRUs was 0.74. The lowest

reliability was found in the ‘‘Non-Inquiry’’ ESRU category. When only the ‘‘Epistemic’’ and

‘‘Conceptual’’ ESRUs were considered, the averaged intercoder reliability was 0.81. We

concluded that coders were consistent in identifying the two important types of ESRUs. Therefore,

the remaining transcripts (10), omitting those that had poor sound, no sound, or did not include

discussions (11), were coded independently.6
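The paper does not name the reliability coefficient that was averaged across cycle types. As a hedged illustration only, the sketch below computes simple percent agreement on the counts two coders assign to a single cycle type across the five test transcripts; the counts are invented.

```python
# Illustrative only: percent agreement between two coders on one cycle type.
# The paper does not specify the coefficient used; the counts are invented.
def percent_agreement(counts_a, counts_b):
    """Proportion of transcripts on which the two coders assigned the same count."""
    matches = sum(1 for a, b in zip(counts_a, counts_b) if a == b)
    return matches / len(counts_a)

coder_a = [3, 0, 2, 1, 4]  # hypothetical counts of, e.g., Epistemic ESRU cycles
coder_b = [3, 1, 2, 1, 4]
print(percent_agreement(coder_a, coder_b))  # 0.8
```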

Measures of Student Learning

To assess student learning we used two sources of information: a multiple-choice

achievement test administered as a pretest, and the information collected in an embedded

assessment administered after Investigation 4 (PS4). The 38-item, multiple-choice test included

questions on density, mass, volume, and relative density.

The embedded assessment analyzed for this study involved three types of assessments:

Graphing (3 items); Predict–Observe–Explain (POE; 12 items); and Prediction Question

(2 items). These assessments were designed to be implemented in three sessions, each taking about

40 minutes to complete.7 In the Graphing prompt, students were provided with a representation of

data that is familiar to them: a scatterplot that shows the relationship between two variables, mass

and depth of sinking. This prompt requires students to: (a) judge the quality of the graph

(completeness and accuracy); (b) interpret a graph that focuses on the variables involved in sinking

and floating; and (c) self-evaluate how well they judged the quality of the graph presented. In the

POE prompt students were first asked to predict the outcome of some event related to sinking and

floating and then justify their prediction. Then students were asked to observe the teacher carrying

out the activity and describe the event that they saw. Finally, students were asked to reconcile and

explain any conflict between prediction and observation. The Predict–Observe (PO) prompt is a


slight variation on the POE. This prompt asked students only to predict and explain their

predictions (P) and observe (O) an event; they were not asked to reconcile their predictions with

what was observed. POs act as a transition to the next instructional activity of the unit in which the third piece of the prompt, the explanation, will emerge. POs can be thought of as a springboard that

motivates students to start thinking about the next investigation.

Reliability of measures of student learning. The internal consistency of the multiple-choice

pretest was 0.86 (Yin, 2004). To assess the interrater reliability of the embedded assessments we

randomly selected 20 students’ responses to each prompt, which were scored independently by two scorers.

The magnitude of the interrater reliability coefficients varied according to the question at hand

(Graphing = 0.97, POE = 0.98, PO = 0.94). We concluded that scorers were consistent in scoring

and coding students’ responses to the different questions involved in the embedded assessment

after PS4. Based on this information, the remainder of the students’ responses (56) were scored by only one of the two scorers. For the 20 students scored to assess interscorer reliability, the

averaged score across the two raters was used for the analyses conducted.
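As with the intercoder analysis, the statistic behind the reported coefficients is not named. If they are Pearson correlations between the two scorers’ totals (an assumption), the check and the subsequent averaging could look like this sketch, with invented scores:

```python
# Illustrative only: interrater agreement as a Pearson correlation, plus the
# two-rater averaging used for doubly scored students. Scores are invented.
from statistics import correlation, mean  # correlation requires Python 3.10+

scorer_1 = [4.0, 7.5, 3.0, 9.0, 6.5]  # hypothetical totals on one prompt
scorer_2 = [4.5, 7.0, 3.0, 8.5, 6.5]

print(round(correlation(scorer_1, scorer_2), 2))  # agreement between scorers
averaged = [mean(pair) for pair in zip(scorer_1, scorer_2)]
print(averaged)  # [4.25, 7.25, 3.0, 8.75, 6.5]
```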

Results

This study responds to two research questions: (1) Can the model of informal formative

assessment distinguish the quality of informal assessment practices across teachers? (2) Can the

quality of teachers’ informal formative assessment practices be linked to student performance? To

respond to the first question, we provide evidence about teachers’ informal assessment practices.

To respond to the second question, we provide information about students’ observed performance

on the pretest and embedded assessments, and link students’ performance to the informal

assessment practices observed across teachers.

Distinguishing Informal Assessment Practices Across Teachers

Table 4 presents the frequency of each type of ‘‘cycle’’ observed in all the assessment

conversations analyzed. We identified both the complete and incomplete informal formative

assessment (ESRU) cycles in the epistemic and conceptual dimensions.

Table 4
Frequency of informal formative assessment cycles by inquiry dimension and teacher(a)

              Epistemic             Conceptual            Non-Inquiry
Teacher       ES    ESR   ESRU      ES    ESR   ESRU      ES    ESR   ESRU
Rob            1     77      1       1      5      0       3      9      0
Danielle      10    119     85       1     21     11       4     13     10
Alex          11     85     27       9     20     10       7     15      2

Consecutive cycles with the same student: Rob: 2 (2); Danielle: 2 (2), 20 (2), 1 (2), 2 (2), 3 (2), 10 (3), 1 (4), 1 (4), 1 (4); Alex: 2 (2), 5 (2), 2 (2), 1 (2), 1 (2), 1 (3), 1 (4), 1 (4).

(a) Numbers in parentheses represent the number of iterations with the same student; the number outside the parentheses represents the number of times observed.

Note that the incomplete


cycle, ESR—teacher elicits, student responds, and teacher recognizes—was the most frequent type of ‘‘cycle’’ observed for Rob and Alex.

Rob had the smallest number of complete and incomplete cycles. Only one complete cycle

was observed from the discussions we analyzed from his videotapes. Although Danielle and Alex

showed the highest frequency of incomplete cycles (ESRs) across the two scientific inquiry

dimensions, there is an important difference in the patterns of these cycles, which is discussed in

more detail in what follows. More complete cycles were observed for Alex than for Rob, but far fewer than were observed for Danielle.

Another important aspect of the cycles captured during the coding was the consecutive

interactions that the teachers had with the same students (we named them iterations). They are

represented in the table as the number in parentheses. The number outside the parentheses

indicates the number of times that double, triple, or quadruple iterations were observed. We

interpreted these iterations as an indicator of spiraling assessment conversations in which the

teacher was looking for more evidence of the students’ level of understanding that could

provide information for helping the student to move forward in her/his learning. We observed

more iterations with the same student in Danielle’s lessons than in those of Rob or Alex. Most

involved two iterations, although three and four iterations were also observed.

It is important to note that most of the cycles were classified as focusing on the ‘‘Epistemic’’

dimension and very few on the ‘‘Conceptual’’ dimension, even though there was opportunity for doing

so, especially in Investigation 4 (PS4). Fifty-two percent of the cycles across the three teachers

were classified as ‘‘Epistemic’’ and only 7% as ‘‘Conceptual.’’ The rest of the cycles were

classified as ‘‘Non-Inquiry.’’ To explore this finding further, Table 5 presents information

regarding the ‘‘eliciting’’ cycle component by category within each dimension. The highest

percentage (25%) of the eliciting questions across the three teachers focused on Asking students

for the application of known procedures, observations, and predictions. Notice that very few

assessment conversations involved Formulation of explanations, Evaluation of quality of the

evidence (to support explanations), or Comparing or contrasting others’ ideas, explanations.

Table 5
Percentage of eliciting questions by category within dimension

Dimension    Eliciting strategy                                        Rob   Danielle   Alex
Epistemic    Compare and contrast observations, data, or procedures      8          1     12
             Use and apply known procedures                             15         29     23
             Provide observations                                       33         15     15
             Make predictions or provide hypotheses                     23          9     13
             Formulate scientific explanations                          12          4      7
             Provide evidence and examples                              10          3      9
             Interpret information, data, patterns                       0         18      2
             Relate evidence and explanations                            0          4      1
             Evaluate the quality of evidence                            0          1      0
             Compare or contrast others’ ideas, explanations             0          1      0
             Suggest hypothetical procedures or experimental plans       0          2     12
             Questions to check students’ comprehension                  0         14      7
Conceptual   Provide potential or actual definitions                     0         59     41
             Apply, relate, compare, contrast concepts                  29         21     26
             Compare or contrast others’ definitions                     0          0      0
             Questions to check students’ comprehension                 71         21     33

It is possible that omitting these important steps in favor of focusing on predictions and


observations may provide students with an incomplete experience in inquiry learning. This

information is important because it reflects that these teachers paid attention to the procedural

aspects of the epistemic frameworks more than to the development of the criteria to make

judgments about the products of inquiry (e.g., explanations).

The highest percentage of the eliciting conceptual questions focused on asking students to

Define concepts (44%) and Checking for students’ understanding (32%). Some questions focused on Applying concepts and very few on Comparing, contrasting, or relating concepts. The

highest percentage in the ‘‘Non-Inquiry’’ category, not shown in the table, focused on asking

students simple and yes/no questions. Alex had the highest percentage on this type of question

(52%), followed by Rob (47%) and Danielle (37%).

Across the three teachers, Repeating and Revoicing were the two recognizing strategies used

most frequently. Although repeating is a common way of acknowledging that students have

contributed to the classroom conversation, revoicing has been considered a better strategy for

engaging students in the classroom conversations (O’Connor & Michaels, 1993). The highest

percentage for use of this code was found for Danielle (31%), followed by Alex (20%), then Rob

(17%). When teachers practice revoicing, ‘‘they work students’ answers into the fabric of an

unfolding exchange, and as these answers modify the topic or affect the course of discussion in

some way, these teachers certify these contributions and modifications’’ (Nystrand & Gamoran,

1991, p. 272). Revoicing, then, is not only a recognition of what the student is saying, but also

constitutes, in a way, an evaluation strategy because the teacher acknowledges and builds on the

substance of what the student says. Furthermore, it has been found that this type of engagement in

the classroom has positive effects on achievement (Nystrand & Gamoran, 1991).

Another strategy that has been considered essential in engaging students in assessment

conversations is Comparing and contrasting students’ responses to acknowledge and discuss

alternative explanations or conceptions (Duschl, 2003). This strategy is essential in examining

students’ beliefs and decision making concerning the transformation of data to evidence, evidence

to patterns, and patterns to explanations. We found that comparing and contrasting students’

responses was not a strategy frequently used by these teachers. The highest percentage was

observed for Danielle (10%), followed by Alex (7%). Across all the assessment conversations,

Rob used this strategy only once. The critical issue in this strategy is whether teachers point out the key differences in students’ reasoning, explanations, or points of view. Furthermore, for

the ESRU cycle to be completed, the next necessary step would be helping students to achieve a

consensus based on scientific reasoning. This can be done by promoting discussion and

argumentation. Clearly, Promoting argumentation was not a strategy used by these teachers. It

seems, then, that the third and most important step to complete the cycle was missed. The

opportunity to move students forward in their understanding of conceptual structures and/or

epistemic frameworks was lost. Helpful feedback, or comments that highlight what has been done

well and what needs further work, was also a strategy rarely used. Danielle was the teacher who

provided the most helpful feedback (14%), followed by Alex (2%).

Characterizing Informal Assessment Practices Across Teachers

Rob. During Rob’s enactment of the four investigations, he exhibited only a few instances of

complete ESRU cycles. More commonly, Rob’s teaching involved broken cycles of ESR; that is,

without the key step of using, basically the same as the IRE/F sequence. Although most of the

broken cycles took place in the ‘‘Epistemic’’ domain, the majority of his questions involved

discussions of procedures, such as how to create a graph. An example of Rob’s conversational style

is illustrated in the excerpt contained in Table 6.


Table 6
Transcript excerpt from Rob’s class

Rob: “Well, we’re going to have to come up with some sort of a scale, some way that we’re going to be able to use our X and Y, horizontal and vertical axes, and so when we do that, let’s try to figure out, how many tall is this? Three boxes tall. Focus that. It’s blurry to me. Is it blurry to you? How many tall? I’ve only counted 20 boxes high, and what’s our maximum number over here on the sheet? It looks like 39, so we could probably go by twos, okay, we’ll start out with twos. Let’s go with zero here, and everything, every full line is going to be two. Two, 4, 6, 8, 10, 12, 14, 16, 18—you guys can count, I don’t have to count for you. I can almost get the whole thing on there for you to look at. Okay. Now, what did you say, 3 here? If we look back over to PS3 [inaudible]. It doesn’t tell us, so we can put PS3 on [inaudible], that’s fine. You’ll know what we’re talking about. So, PS3 and then put individual or mine or something like that that will tell you that this is your group’s, and then we’ll go to the next one and it will be PS3 and then we’ll put the class; that way, you’ll know one’s class data and one’s yours.”

Student: [Inaudible.]

Rob: “Okay. Now, going across the bottom here, let’s move a little bit quicker on this, going across the bottom, our numbers are going to be, what, we’ve got 4 centimeters, 6, 5, 6, 7, 8, so we’ve got 10, so if we could go by four—how many [inaudible]?” (Cycle: E. Code: Teacher asks students to use and apply known procedures.)

Student: “Sixteen.” (Cycle: S.)

Rob: “Sixteen across the bottom. Let’s say each one could be, or each two could be one, but that’s probably going to push the issue. [Inaudible.] Okay, so let’s go with each box is worth one, and that way we fit everything, we may get a little cramped but we’ll fit everything on the page.” (Cycle: R. Code: Teacher revoices student’s words AND Teacher elaborates based on student’s response.)

Student: [Inaudible.]

Rob: “I just stopped at nine because that’s all we have up here. Now, we’ve been preaching about you’ve got to have units and variables, so what were we measuring down here, length?” (Cycle: E. Code: Teacher asks students to use and apply known procedures.)

Student: “Of the straw.” (Cycle: S.)

Rob: “And how did we measure it?” (Cycle: E. Code: Teacher asks students to use and apply known procedures.)

Student: “Centimeters.” (Cycle: S.)

Rob: “Centimeters, okay?” (Cycle: R. Code: Teacher repeats student’s words.)

During this part of the lesson, Rob provides some


During this part of the lesson, Rob provides some guidance to the students on the task they will complete and models how to graph on an overhead.

Notice how Rob repeats students’ responses but does not ask students to elaborate their responses.

In addition, Rob’s speaking turns are quite long in comparison to the shorter responses provided by

his students.

In the excerpt, Rob provides the students with direction on how to make a graph as he models

it for the class. Although he asks questions in the course of his initial speaking turn, he does not

wait for students to respond (repaired questions). Although students do respond to some of his

questions later, the conversation elicits a minimal amount of information, and the students’

responses are simply repeated by the teacher. Given these trends in Rob’s teaching, it appears that

most of his discussions are not assessment conversations at all, but seem to be more general

whole-class discussions in which the teacher guides classroom talk and students participate

minimally.

Danielle. In contrast to Rob’s teaching style, Danielle’s classroom discussions are

characterized by a larger number of complete ESRU cycles. Although the majority of these

cycles were also in the ‘‘Epistemic’’ domain, as with Rob, Danielle’s teaching differs in that she

asks multiple questions of the same student, often going through several iterations of the ESRU

cycle with the same student. Furthermore, she asks students to interpret patterns in their graphs in

addition to speaking about the procedure of constructing them. The differences between Danielle

and Rob’s teaching can be illustrated in the excerpt in Table 7, in which Danielle discusses

construction of the same graph as Rob worked with in the previous example. As with Rob, during

this ‘‘conversation’’ Danielle provides some guidance to the students on the task and she also

models on an overhead how to graph. However, Danielle asks questions that will provide her with

information about the students’ understanding about constructing a graph. After repeating a

student’s response, she asks students to elaborate their responses. Also, she positively reinforces a

student for using ‘‘a very scientific word.’’

In contrast to Rob, Danielle starts her conversation in this excerpt with student input, and

builds upon it by repeating students’ words (accepting them into the ongoing conversation), asking

students to elaborate, and clarifying and elaborating upon student comments. This pattern

occurred many times in our observations of Danielle’s teaching. This example thus fits the model

of an assessment conversation, in that Danielle’s questions are more instructionally responsive

than would be expected from an everyday lesson. That is, Danielle asks questions similar to

those Rob asks his students, and also frequently repeats what they have said; however, Danielle

pushes her students for more elaboration on their responses, and also provides feedback

that allows students to learn more about classroom expectations for learning. Danielle’s lessons

also showed characteristics of assessment conversations not found in Rob’s or Alex’s lessons. For

example, she was the only one comparing and contrasting students’ responses. Rather than

moving from student to student without identifying how students’ ideas were different, Danielle

highlights these differences in the excerpt in Table 8, taken from a discussion about an unknown

liquid.

Alex. Like Rob's, Alex's teaching included more broken than complete ESRU cycles, although Alex completed cycles more frequently than Rob did. Like the other two teachers, Alex modeled graphing on an overhead and checked students' understanding. However, as with Rob, the conversation has more the tone of delivering guidance on how to make a graph than of building upon student input. Also, like Danielle and Rob, Alex asked mostly procedural questions, but the

responses he sought were not provided by the students, so ultimately Alex provided the responses

himself. The excerpt in Table 9, like previous ones, is taken primarily from the ‘‘Epistemic’’

domain. Rather than illustrating iterative ESRU cycles, this excerpt shows multiple broken cycles,

including one that ends with an evaluative comment.


Although Alex does show concern for the level of his students’ understanding, reflecting the

fact that he has gained information from their responses, he does not use the responses in a manner

consistent with the ESRU model; rather, he provides answers himself, and asks students if what he

has said makes sense to them. In this manner, the model of assessment conversations also does not

characterize well the patterns of interactions in Alex’s classroom.

Summary. Given the divergent pictures of informal formative assessment just

constructed, examining assessment conversations through the lens of the ESRU model has

allowed us to capture differences between teachers’ informal formative assessment practices.

Based on the evidence collected, we conclude that Danielle conducted more assessment

conversations than Rob and Alex and that the characteristics of the assessment conversations were

more consistent with our conception of informal assessment practices in the context of

scientific inquiry. We also conclude that Alex’s informal formative assessment strategies

were somewhat more aligned with our model than those of Rob. In the next section we provide

evidence on students’ performance and link the performance with teachers’ informal assessment

practices.
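To make this kind of tallying concrete, the following minimal sketch (in Python, with invented labels and data; it illustrates the bookkeeping, not the authors' actual coding instrument) shows how a transcript coded as a sequence of cycle moves can be scanned for complete ESRU cycles versus broken ESR cycles:

# Minimal sketch: counting complete (ESRU) vs. broken (ESR) cycles in a
# transcript coded as a sequence of moves. Labels mirror the cycle column
# of Tables 6-9; the data and function are illustrative assumptions.

def count_cycles(moves):
    """Tally complete ESRU cycles and broken ESR cycles: a span runs from
    one elicitation (E) to the next; it is complete if it contains S, R,
    and U, and broken if it contains S and R but no U."""
    complete, broken = 0, 0
    i = 0
    while i < len(moves):
        if moves[i] == 'E':
            j = i + 1
            while j < len(moves) and moves[j] != 'E':
                j += 1
            span = moves[i:j]
            if 'S' in span and 'R' in span:
                if 'U' in span:
                    complete += 1
                else:
                    broken += 1
            i = j
        else:
            i += 1
    return complete, broken

# Hypothetical excerpt: two elicitations, only the first followed through
# to a "use" move.
moves = ['E', 'S', 'R', 'U', 'E', 'S', 'R']
print(count_cycles(moves))  # -> (1, 1)

On this hypothetical sequence, the first elicitation is carried through to a use move and counts as complete, whereas the second stops at recognition and counts as broken, mirroring the ESR pattern described for Rob.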

Table 7
Transcript excerpt from Danielle's class (parentheses indicate that the cycle component is ''implicit'')

Danielle: ...So the first three things you want to do, very important things, you want to label your vertical axis, you want to label your horizontal axis, and then you want to give the whole graph a title. And we've done that. So, taking a look at this, am I ready to go? Can I start plotting my points? [E: Teacher asks students to use and apply known procedures]

Student: No. [S]

Danielle: Why not? [(R) U: Teacher asks student to elaborate on his response]

Student: You didn't... [S]

Danielle: What do we need to do? Eric? [E: Teacher asks students to use and apply known procedures]

Student: Put the scale. [S]

Danielle: The scales. What do you mean by scales? [R: Teacher repeats student's words; U (E): Teacher checks students' understanding of known procedures]

Student: The numbers. [S]

Danielle: ''The numbers.'' Good. Excellent. I liked that you used the word ''scales,'' it's a very scientific word. So, yes, we need to figure out what the scales are, what we should number the different axes. [R: Teacher repeats student's words; U: Teacher provides descriptive or helpful feedback]


Table 8
Transcript excerpt from Danielle's class with comparing and contrasting students' responses and promoting discussion among students

Danielle: So, that's when we're talking about liquid 2; what about that bottom liquid, does anyone have any theories on that bottom liquid? I know a few tables mentioned it. Amanda? [E: Teacher asks students to provide hypotheses]

Student: It's water. [S]

Danielle: Why do you think it's water. She's confident. She didn't even say, well, I think, she's like, it's water. Why do you think it's water. [R: Teacher repeats student's words; U (E): Teacher asks student to elaborate previous response]

Student: Because it's water and oil [inaudible] on the water. [S]

Danielle: Okay. So she's saying you put water and oil together, the oil is on the top. So she's saying that water's on the bottom. Another table had said something different. What did you guys think? [R: Teacher revoices student's words; U: Teacher compares/contrasts students' responses; E: Teacher promotes discussion/argumentation among students]

Student: The oil's on the bottom and the water's on the top. [S]

Danielle: They thought the opposite. They did think it was oil and water but they thought oil was on the bottom. Billy? [(R) U: Teacher compares/contrasts students' responses; (E): Teacher promotes discussion/argumentation among students]

Student: We thought, we knew the bottom couldn't have been water because the bottom, water would float on top of oil because... [S]

Danielle: How do you know that? How do you know that? [(R) U (E): Teacher asks student to elaborate previous response]

Student: I've done experiments at home with oil and water. [S]

Danielle: Okay. Now, Amanda thinks she did an experiment in home too, and it was the opposite, the oil was on top of the water, so we have two different things here, two different people did the same kind of experiment at home with two different results. So what do we believe? [(R) U: Teacher compares/contrasts students' responses]


Table 9
Transcript excerpt from Alex's class

Alex: ...Jane, what do we call that when we number the X and Y axis? [E: Teacher checks students' comprehension]

Student: Labeling? [S]

Alex: Right, but what is it after we label the X and Y axis? What does that become? [R: Teacher provides evaluative response; E: Teacher checks students' comprehension]

Student: The [inaudible]. [S]

Alex: When I say it, you guys will remember it. It's called the scale. Right, we have to put a scale, our scale, it tells us what it is we're measuring, right? Actually, it tells, the title tells us what it is we're measuring, but the scale gives us increments to use to measure it with. So, looking on our data, where is my data sheet, looking on our data up here, how far do we have to go, how big of a number do we [inaudible]? Someone who hasn't talked yet today. If you look on here, we want to know the number of bb's; right? We need to be able to fit on our graph all the bb's that were counted off in this data table. What's the biggest number that we have? Someone who hasn't talked yet. Jessica. [R: Teacher elaborates based on student's response; E: Teacher asks students to use and apply known procedures]

Student: Forty. [S]

Alex: Forty. Is 40 your biggest one? Over here? Oh, you know what, that is our biggest one. We can use these numbers. We don't have all of them but we can go ahead and use them. So, 40. You want to make sure that we can get 40 on the graph. So, you know what, guys, I take that back, I take that back. I only want to graph this data to the left of that line; does that make sense? Do you guys want to graph these numbers over here? [R: Teacher repeats student's words; E: Teacher asks students simple question]

Students: No. [S]

Alex: The only reason I say that is because I'm looking at my graph, and we're going to have our data here all kind of clumped in one spot, and then these numbers are going to be huge because we [inaudible] centimeters. So, I think this, for this graph, it might look, it might make more sense to [inaudible] this is our first graph, I don't want to confuse you, because it's line of best fit I find often confuses people. So I'm going to make this first one a little less confusing. So, I only want to graph this part of the [inaudible]. Okay? So, if you're in group 1, you're going to do that first, group 1 first. But, anyway, you have that [inaudible]. Does that make sense? Did that confuse anybody? No? Okay.


Linking Teachers’ Informal Assessment Practices and Students’ Performance

To link the teachers’ informal assessment practices with student performance we first provide

evidence on the initial status of the teachers' students before the instruction of the 12 investigations began. We then provide information about the students' performance on the embedded assessments after PS4.

Students’ observed performance in the pretest. The 38-item multiple-choice test was used to

assess whether students across the three groups were similar in their knowledge of relative

density before Investigation 1 (PS1). To take into consideration the nested nature of the data

collected we first tested differences between the three groups with a one-way analysis of variance

(ANOVA), and calculated an intraclass correlation coefficient to explore independence of

the observations. The one-way ANOVA indicated no significant differences between groups

[F(2, 77) = 0.74, p = 0.48], and the magnitude of the intraclass correlation coefficient (ρ = 0.009)

indicates that the proportion of the variance of the student performance due to teacher at the

beginning of the instruction was extremely low. We conclude that, in general, students within each

of the groups studied were similar at the beginning of the study in their understanding of relative

density.
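As an illustration of this kind of check, the sketch below computes the between-group F test and an intraclass correlation for three classes. The data are synthetic, and the one-way random-effects ANOVA estimator of the ICC is an assumption; the paper does not specify which formula was used.

import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
# Hypothetical pretest scores (out of 38) for three classes of similar ability.
groups = [rng.normal(20, 5, size=n).round() for n in (23, 25, 26)]

f_stat, p_value = stats.f_oneway(*groups)

# One-way ICC: (MSB - MSW) / (MSB + (k0 - 1) * MSW), where k0 is an
# average group size adjusted for unbalanced groups.
k = len(groups)
n_i = np.array([len(g) for g in groups])
N = n_i.sum()
grand = np.concatenate(groups).mean()
msb = sum(n * (g.mean() - grand) ** 2 for n, g in zip(n_i, groups)) / (k - 1)
msw = sum(((g - g.mean()) ** 2).sum() for g in groups) / (N - k)
k0 = (N - (n_i ** 2).sum() / N) / (k - 1)
icc = (msb - msw) / (msb + (k0 - 1) * msw)

print(f"F = {f_stat:.2f}, p = {p_value:.2f}, ICC = {icc:.3f}")

An ICC near zero, as reported above, indicates that almost none of the pretest variance lies between teachers, so treating student observations as approximately independent at baseline is defensible.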

Students’ observed performance at the embedded assessments. Mean scores and standard

deviations across assessments are provided in Table 10. Students’ performance varied across

questions and teachers. The highest student performance across the three items was observed for Danielle, followed by Alex and Rob. Only in Question 3, Predict–Observe, was Alex's students' performance lower than that of Rob's students.

We carried out a general linear model approach with teacher fixed effects to analyze the scores

across the three embedded assessments. The model shows significant differences on the

average scores between teachers for Graphing [F(2, 69) = 5.564, p = 0.006, R² = 0.139], POE [F(2, 70) = 28.939, p = 0.000, R² = 0.453], and PO [F(2, 51) = 5.257, p = 0.008, R² = 0.171].

Therefore, we conclude that teacher appears to be a predictor for assessment scores.

Unfortunately, not all of Rob’s students’ responses were available for Assessment 2, POE. Only

the Observation–Explanation part could be scored for this teacher. Therefore, we carried out a

second analysis between Danielle’s and Alex’s students only. Results were replicated, but with a

lower R² [F(1, 46) = 10.08; p = 0.003; R² = 0.180].
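A minimal sketch of such an analysis, assuming synthetic scores and entering teacher as a categorical fixed effect in an ordinary least-squares model (the statsmodels formula API is one common way to do this, not necessarily the software the authors used):

import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(1)
# Synthetic Graphing scores; class sizes and rough means/SDs echo Table 10.
df = pd.DataFrame({
    "teacher": ["Rob"] * 23 + ["Danielle"] * 23 + ["Alex"] * 26,
    "graphing": np.concatenate([
        rng.normal(6.2, 2.9, 23),
        rng.normal(8.8, 2.7, 23),
        rng.normal(6.4, 3.1, 26),
    ]),
})

# C(teacher) dummy-codes the factor; the overall F test asks whether
# teacher explains variance in the assessment score.
model = smf.ols("graphing ~ C(teacher)", data=df).fit()
print(f"F({int(model.df_model)}, {int(model.df_resid)}) = {model.fvalue:.3f}, "
      f"p = {model.f_pvalue:.3f}, R2 = {model.rsquared:.3f}")

With three groups, this overall F test is equivalent to a one-way ANOVA; the fixed-effects framing simply makes the teacher-as-predictor interpretation explicit.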

Linking teacher practices to student performance. The results described indicate that

Danielle’s students were those whose observed performance was higher across all the questions; in

fact, her students performed significantly higher in the embedded assessments than those of the

other two teachers. In addition, compared with Rob and Alex, Danielle’s strategies were most

consistent with the ESRU model.

Table 10
Mean scores and standard deviations across the embedded assessments

                                          Rob                 Danielle             Alex
Assessment                    Maximum     n   Mean   SD       n   Mean    SD       n   Mean   SD
1. Graphing                   14          23  6.17   2.87     23  8.76    2.73     26  6.44   3.09
2. Predict–Observe–Explain    20          25  6.32a  2.89a    25  14.84   4.49     22  10.78  4.34
3. Predict–Observe            2           22  1.27   0.98     22  1.54    1.01     10  0.40   0.52

a Students' responses for the Predict part were not available.


It is important to remember that, at the beginning of the school year, there was no significant difference in student performance on the pretest among the students of the three teachers, and that teacher was not a source of variability between groups. However, at the

end of the fourth investigation, teacher was a good predictor. This information leads us to conclude

that better informal assessment practices could be linked to better student performance, at least in

the sample we used for this study.

Conclusions

In this investigation we studied an approach for examining informal assessment practices

based on four components of assessment conversations (the teacher Elicits a question, the Student

responds, the teacher Recognizes the student’s response, and then Uses the information collected

to support student learning) and three domains linked to science inquiry (epistemic frameworks,

conceptual structures, and social processes). We have proposed assessment conversations (ESRU

cycles) as a step beyond the IRE/F pattern and have described how they may lead to increases in student learning.

Our findings regarding broken cycles (ESRs) are consistent with those of the IRE/F studies, in

that teachers generally conduct one-sided discussions in which students provide short answers that

are then evaluated or provided with generic feedback by the teacher (Lemke, 1990). However, we

believe that our ESRU model is a more useful way of distinguishing those teacher–student

interactions that go beyond the generic description of ‘‘feedback’’; it is the final step of using

information about students’ learning that distinguishes ESRU cycles from IRE/F sequences.

Using implies more than providing evaluation or generic forms of feedback to learners; rather, it involves helping students move toward learning goals. In the ESRU cycle a teacher can

ask another question that challenges or redirects the students’ thinking, promotes the exploration

and contrast of students’ ideas, or makes connections between new ideas and familiar ones.

Teachers also can provide students with specific information on actions they may take to reach

learning goals. In this manner, viewing whole-class discussions in an inquiry context through

the model of ESRU cycles allows the reflective and adaptive nature of inquiry teaching to be

highlighted.

We asked two questions in relation to the ESRU cycle: Can the ESRU cycle distinguish the quality of informal assessment practices across teachers? Can the quality of

teachers’ informal assessment practices be linked to the students’ performance? To respond to

these questions we studied the informal formative assessment practices in different science

classrooms. Using the model of ESRU cycles we tracked three teachers’ informal formative

assessment practices over the course of the implementation of four consecutive FAST

investigations. Furthermore, to track informal assessment practices and their impact on student learning, we collected information on the students' performance using two sources of information: a multiple-choice pretest, and three less conventional assessments administered as embedded assessments (a graphing prompt, a predict–observe–explain prompt, and a predict–observe prompt).

We have provided evidence that the approach developed could capture differences in

assessment practices across the three teachers studied. Within our small sample, it seems

that the teacher whose whole-class conversations were more consistent with the ESRU cycle had

students with higher performance on embedded assessments. We have also provided evidence

about the technical qualities of the approach used to capture ESRU cycles. Intercoder reliability

was 0.81 for cycles involved in the ''Epistemic'' and ''Conceptual'' dimensions. Evidence related to the validity of the coding system was also provided by showing that the approach could capture differences in informal assessment practices. We found that the model

provided important information about the teachers’ informal assessment practices.
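The statistic behind the 0.81 figure is not specified in the text; as one hedged illustration, simple percent agreement and Cohen's kappa over two coders' cycle labels could be computed as follows (the labels here are synthetic):

from collections import Counter

coder_a = ['E', 'S', 'R', 'U', 'E', 'S', 'R', 'E', 'S', 'R', 'U', 'E']
coder_b = ['E', 'S', 'R', 'U', 'E', 'S', 'U', 'E', 'S', 'R', 'U', 'E']

n = len(coder_a)
# Raw proportion of turns on which the two coders assign the same code.
agreement = sum(a == b for a, b in zip(coder_a, coder_b)) / n

# Cohen's kappa corrects the raw agreement for chance using the marginals.
pa, pb = Counter(coder_a), Counter(coder_b)
expected = sum(pa[c] * pb[c] for c in set(coder_a) | set(coder_b)) / n ** 2
kappa = (agreement - expected) / (1 - expected)

print(f"percent agreement = {agreement:.2f}, kappa = {kappa:.2f}")

Because kappa discounts chance agreement, it typically runs below raw percent agreement, which is one reason reliability reports should name the statistic used.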


Our findings indicate that, in the context of scientific inquiry, teachers seem to focus more on

procedures than on the process of knowledge generation, and thus students may not be

getting a full experience with scientific inquiry. Based on the information collected, we believe

that Duschl’s (2003) epistemic dimension seems to include both the procedures necessary to

generate scientific evidence and the reasoning processes involved with generation of scientific

knowledge. Classifying these two elements of the scientific process together masks our finding

that teachers focused on the procedures involved in scientific inquiry rather than the process of

developing scientific explanation. Making this distinction can provide clearer evidence about how

students are provided with opportunities to determine what they know, how they know what they

know, and why they believe what they know (Duschl, 2003).

Our findings also suggest that instructional responsiveness is a crucial aspect of scientific

inquiry teaching. That is, teachers need to continuously adapt instruction to students’ present level

of understanding rather than pursuing a more teacher-directed instructional agenda. In the

absence of the crucial step of using information about student learning, teachers may fall into the IRE/F sequence, pursuing an instructional agenda that is unresponsive to evolving student understanding. Adapting instruction in this way is a challenging

task, as it involves both planning and flexibility, or as Sawyer (2004) termed it, ‘‘disciplined

improvisation.’’

In drawing conclusions from the study, it is important to note differences in the level of

training, experience, and other contextual factors of these three teachers. Although Rob and Alex

were trained in science and would be expected to have a deeper understanding of the content area,

their practices were less consistent with the ESRU model than those of Danielle. This result suggests

that content knowledge alone may not be a sufficient precursor to conducting informal formative

assessment. Furthermore, Rob had more years of teaching experience than Danielle and Alex;

however, the near absence of informal formative assessment practices in his conversations reveals

that he may not be focusing on collecting information about student learning during class. Finally,

it is important to note that Rob and Alex were teaching seventh grade science in large middle

schools, where they saw more than 100 students each day. In contrast, Danielle taught sixth grade

science and mathematics at an elementary school, working with about 55–60 students each day.

Although it is impossible to disaggregate all the contextual factors associated with the differences

between the elementary and middle school setting given the design of this study, we at least note

that some of the differences in patterns of interactions may have been associated with the contexts

of each school.

As the science education community continues to struggle to define what it means to be an

instructionally responsive teacher in the context of scientific inquiry, being explicit about

the differences between IRE/F and ESRU questioning could help teachers to understand the

differences between asking questions for the purpose of recitation and asking questions for the

purpose of eliciting information about and improving student learning. Pre- and in-service

teachers alike may benefit from learning about the ESRU cycle as a way of thinking about

classroom discussions as assessment conversations, or opportunities to understand the students’

understanding and to move students toward learning goals in an informal manner.

Future research should pursue the course of the present study by applying the ESRU model of

informal formative assessment to more teachers across more lessons to see if the trend observed in

the small sample can be replicated. Furthermore, additional studies should consider the students’

responses in more detail than we included herein. For a classroom discussion to be truly dialogic,

teachers should ask questions of an open nature that leave room for student responses that are

longer than one or a few words and that contribute real substance to classroom conversations

(Scott, 1998).


As mentioned at the outset of this study, assessment, instruction, and curriculum need to be

aligned in order for educational reform to be successful. This alignment is, like other

reforms, predicated upon the teacher, who is the instrumental figure in any classroom. Providing

teachers with tools that will help them to integrate assessment into the course of everyday

instruction to meet the goals of a curriculum will help to realize educational reforms. Scientific

inquiry learning, which has been identified as an essential element of students’ experience in

science education (National Research Council, 2001), may thus be better achieved by speaking to

teachers about models of informal formative assessment, so that their instruction may be

continuously adaptive to meet student learning goals.

The findings and opinions expressed in this report do not reflect the positions or

policies of the National Institute on Student Achievement, Curriculum, and Assessment,

the Office of Educational Research and Improvement, or the U.S. Department of

Education.

Notes

1. We acknowledge that this is not always the case. Some laboratory studies (Kruger & Dunning, 1999; Markham, 1979) have indicated that less competent students do not have the necessary metacognitive skills and/or processing requirements to be aware of the quality of their performance or to know what to do to improve it. On the other hand, multiple classroom studies have shown the positive impact, on average, of appropriate feedback on student performance, especially of low-competence students (see Black & Wiliam, 1998). We rely on these latter results because they take place during classroom instruction and are therefore of greater relevance to the present study.

2. The focus of this study is confined to the teachers' actions during the process of formative assessment. The reason behind this decision is practical more than conceptual. Because information was collected on videotape, requiring the teachers to use a microphone, students' comments and questions were not always captured.

3. Feedback can be considered only if it can lead to an improvement of student competence (Ramaprasad, 1983; Sadler, 1989). If the information is simply recorded, passed to a third party, or if it is too coded (e.g., a grade or a phrase such as ''good!'') to lead to an appropriate action, the main purpose of the feedback is lost and can even be counterproductive (Sadler, 1989, 1998). Effective feedback should lead students to be able to judge the quality of what they are producing and be able to monitor themselves during the act of production (Sadler, 1989, 1998).

4. Although describing the final step of ESRU as feedback would clearly be more appropriate in the context of informal formative assessment, we employ the term ''use'' rather than ''feedback'' to make a clear distinction between the IRF pattern and the ESRU pattern.

5. One of Danielle's transcripts was lost due to low battery charge.

6. The low intercoder reliability coefficient was due mainly to the small sample of transcripts considered. If coders identified one or two ESRUs in a different category, either within or between types, this could easily change the coefficient. We acknowledge that the sample size for evaluating consistency across coders is small.

7. We did not include in our analysis the information on open-ended questions (a single open-ended question that asks students to explain why things sink and float, with supporting examples and evidence) because the information gathered was similar to the interpretation of the graph. The information will be analyzed at a later point.


Appendix

Teacher Asks Students to: / Students Are Asked to:

Define concept(s). / Provide actual or potential definitions for concepts; e.g., ''What is mass?''

Use, apply, and compare concept(s). / Make use of or apply previously defined concepts, or to compare/contrast concepts with each other; e.g., ''Is this an anomaly?'' or ''What is the difference between mass and volume?''

Use and apply known procedures.

Share experiences outside the classroom. / Provide experiences they have had outside the context of the present science classroom to content related to the current discussion.

Share experiences inside the classroom. / Provide responses not based on observations (e.g., hw). Read or display responses from homework or some other format that were not initially gathered as observations.

Provide observations. / Share their observations. These could be either oral or written.

Provide predictions/hypothesis. / Share their predictions and hypotheses. These could be either oral or written.

Explanations. / Share their explanations. These could be either oral or written.

Provide evidence and examples. / Provide supporting evidence and examples collected in a scientific manner.

Interpreting data/results, graphs.

Relate evidence to explanations. / Connect evidence collected during class to an explanation.

Evaluate quality of evidence (promotes). / Examine the quality of evidence provided for a claim.

Compare/contrast ideas. / Compare and contrast the ideas of other students.

Suggest hypothetical procedure/experimental plan. / Propose a hypothetical procedure or experiment that could be performed to investigate a problem.

Teacher:

Defines. / Provides a definition to students.

Clarifies/elaborates. / Provides more information to further tune or make clearer a previous statement made by a student or the teacher.

Takes votes. / Counts how many students agree with a conception, a statement, a response, a prediction, etc.

Compares/contrasts students' responses/explanations. / Openly compares or contrasts different points of view expressed by the students. Drives students to achieve consensus. Compares and contrasts the responses of students, in discussion or visually (board/overhead).

Relates evidence and explanations (WTSF). / Connects evidence collected in class to scientific explanations.

Repeats/paraphrases student words. / Repeats verbatim, with the omission or adjustment of only a few words, the statement of a student. The paraphrasing or repeating does not change or contribute additional meaning to the student's statement.

Revoices student words by: (1) incorporating the student contribution into the class conversation; or (2) acknowledging student contributions. / Summarizes what students said. Incorporates students' words into the flow of a classroom conversation by using selected students' words in an ongoing train of thought. This involves the teacher modifying or building upon the students' original meaning to further a point or to ''toss'' a comment back to a student with new wording to see if the teacher has grasped the student's original meaning.

Does comprehension checking (factual, open-ended questions). / Asks students questions of a factual nature, but not in a yes/no or fill-in-the-blank manner.

Does comprehension checking (monitoring students). / Halts the course of a discussion to see if students are ''with it''; Are you with me? Is this making sense?

Provides evaluative responses (correct/incorrect). / Provides students with evaluative responses that indicate or suggest the student is correct or incorrect. This may include responses that do not involve evaluative language, but that are contrary to what a student said, with corrective intent; e.g., Yes! Good! Or S: weight? T: mass.

Provides feedback: descriptive or explicit improvements. / Provides students with feedback that is either descriptive of the student response in a positive or negative way, or states explicitly how student performance can be improved.

Promotes making sense. / Promotes connections: Is this what you expected? Is there anything that does not fit? Does your hypothesis make sense with what you know?

Models how to communicate scientific knowledge. / Enacts a method for scientific communication for students.

Models process skill. / Enacts a procedure for students.

Waits for student to respond (wait-time). / Allows a noticeable amount of time to elapse after asking a question and before students respond, or before deconstructing or elaborating on the question.

Makes learning goals explicit. / States or refers to the learning goals for the lesson, unit, day, or year.

Orients students/sets a task. / Sets a task to students or orients how to do a task.

Makes explicit classroom expectations/standards. / Makes explicit expectations for the classroom, such as behavior or management routines, or standards, such as how lab reports are to be written, etc.

Provides/reviews criteria. / Makes explicit the criteria or standards by which students' performance will be evaluated.

Captures/displays student responses/explanations. / Uses some manner of capturing and displaying students' responses, such as by writing them on the board or overhead, or having students write their responses on posters, which are then displayed to the class.

Makes connections to previous learning. / Makes a connection between present learning and learning that occurred previously in the classroom context.

Refers to nature of science. / Makes a reference to the nature of science or the actions of real scientists; e.g., scientists make observations, it's important to test only one variable, etc.

Connects topic with the real world. / Connects the present topic with a real-world application or example.

Promotes sharing students' thinking. / Promotes sharing students' thinking by making presentations of their work to the whole class.

Promotes debating/discussing among students. / Promotes students speaking to each other independently of the teacher in a whole-class setting.

Promotes students exploring their own ideas. / Encourages students to address or look for information regarding issues of their own concern. Explores students' ideas. Performs/demonstrates in response to a student's idea.

Interrupts flow of discussion or student responses. / Cuts off a student making a statement, or interrupts or disturbs the course of conversation.

Asks for yes/no or fill-in-the-blank answers. / Asks closed-ended, recitation-style questions of a yes/no nature, or where a fill-in-the-blank, clearly expected response is suggested.

Asks repaired questions (no opportunity for the student to respond). / Asks a question that is rhetorical, or for which the teacher does not provide students an opportunity to respond, by asking another question, moving on, or answering the question.

Other codes: On-task but off-interest; Inaudible teacher; Inaudible student(s).


References

Bell, B., & Cowie, B. (2001). Formative assessment and science education. Dordrecht:

Kluwer.

Black, P. (1993). Formative and summative assessment by teachers. Studies in Science

Education, 21, 49–97.

Black, P., & Wiliam, D. (1998). Assessment and classroom learning. Assessment in

Education, 5, 7–74.

Brophy, J.E., & Good, T.L. (1986). Teacher behavior and student achievement. In M.C.

Wittrock (Ed.), Handbook of research on teaching (3rd ed., pp. 328–375). New York:

Macmillan.

Cazden, C.B. (2001). Classroom discourse: The language of teaching and learning (2nd ed.).

Portsmouth, NH: Heinemann.

Christie, F. (2002). Classroom discourse analysis: A functional perspective. London:

Continuum.

Clarke, S. (2000, April). Closing the gap through feedback in formative assessment: Effective

distance marking in elementary schools in England. Paper presented at the AERA annual meeting.

New Orleans, LA.

Curriculum Research and Development Group. (2003). Foundational approaches in science

teaching I. Retrieved June 1, 2004, from www.hawaii.edu/crdg/FAST.pdf

Driver, R., Leach, J., Millar, R., & Scott, P. (1996). Young people’s images of science.

Buckingham, UK: Open University Press.

Duschl, R.A. (2000). Making the nature of science explicit. In R. Millar, J. Leach, &

J. Osborne (Eds.), Improving science education: The contribution of research. Buckingham, UK:

Open University Press.

Duschl, R.A. (2003). Assessment of inquiry. In J.M. Atkin & J.E. Coffey (Eds.), Everyday

assessment in the science classroom (pp. 41–59). Washington, DC: National Science Teachers

Association Press.

Duschl, R.A., & Gitomer, D.H. (1997). Strategies and challenges to changing the focus of

assessment and instruction in science classrooms. Educational Assessment, 4, 37–73.

Duschl, R.A., & Grandy, R.E. (2005, February). Reconsidering the character and role of

inquiry in school science: Framing the debates. Paper presented at the Inquiry Conference on

Developing a Consensus Research Agenda, Piscataway, NJ.

Edwards, D., & Mercer, N. (1987). Common knowledge: The development of understanding

in the classroom. London: Routledge.

Edwards, A.D., & Westgate, D.P.G. (1994). Investigating classroom talk (2nd ed.). London:

Falmer Press.

Kelly, G.J., & Crawford, T. (1997). An ethnographic investigation of the discourse processes

of school science. Science Education, 81, 533–559.

Kruger, J., & Dunning, D. (1999). Unskilled and unaware of it: How difficulties in

recognizing one’s own incompetence lead to inflated self-assessment. Journal of Personality and

Social Psychology, 77, 1121–1134.

Lemke, J.L. (1990). Talking science: Language, learning and values. Norwood, NJ: Ablex.

Markham, E.M. (1979). Realizing that you don’t understand: Elementary school children’s

awareness of inconsistencies. Child Development, 50, 643–655.

Marks, G., & Mousley, J. (1990). Mathematics education and genre: Dare we make the

process writing mistake again? Language and Education, 4, 117–135.


Martin, J.R. (1989). Factual writing: Exploring and challenging social reality. Oxford, UK:

Oxford University Press.

Martin, J.R. (1993). Literacy in science: Learning to handle text as technology. In M.A.K.

Halliday & J.R. Martin (Eds.), Writing science: Literacy and discursive power (pp. 166–202).

Pittsburgh, PA: University of Pittsburgh Press.

National Research Council. (1999). The assessment of science meets the science of

assessment. Board on Testing and Assessment Commission on Behavioral and Social Sciences

and Education. Washington, DC: National Academy Press.

National Research Council. (2001). Inquiry and the national science education standards.

Washington, DC: National Academy Press.

Nystrand, M., & Gamoran, A. (1991). Instructional discourse, student engagement, and

literature achievement. Research in the Teaching of English, 25, 261–290.

O’Connor, M.C., & Michaels, S. (1993). Aligning academic task and participation status

through revoicing: Analysis of a classroom discourse strategy. Anthropology and Education

Quarterly, 24, 318–335.

Pottenger, F., & Young, D. (1992). FAST 1: The local environment. Manoa, HI: Curriculum

Research and Development Group, University of Hawaii.

Ramaprasad, A. (1983). On the definition of feedback. Behavioral Science, 28, 4–13.

Resnick, L.B., Salmon, M., Zeitz, C.M., Wathen, S.H., & Holowchak, M. (1993). Reasoning

in conversation. Cognition and Instruction, 11, 347–364.

Redfield, D.L., & Rousseau, E.W. (1981). A meta-analysis of experimental research on

teacher questioning behavior. Review of Educational Research, 51, 237–245.

Rogg, S., & Kahle, J.B. (1997). Middle level standards-based inventory. Oxford, OH:

Miami University.

Roth, W.-M. (1996). Teacher questioning in an open-inquiry learning environment:

Interactions of context, content, and student response. Journal of Research in Science Teaching,

33, 709–736.

Sadler, D.R. (1989). Formative assessment and the design of instructional systems.

Instructional Science, 18, 119–144.

Sadler, D.R. (1998). Formative assessment: Revisiting the territory. Assessment in Education,

5, 77–84.

Sawyer, R.K. (2004). Creative teaching: Collaborative discussion as disciplined improvisa-

tion. Educational Researcher, 33, 12–20.

Schwab, J.J. (1962). The concept of the structure of a discipline. The Educational Record, 43,

197–205.

Scott, P. (1998). Teacher talk and meaning making in science classrooms: A Vygotskyian

analysis and review. Studies in Science Education, 32, 45–80.

Shavelson, R.J., Black, P., Wiliam, D., & Coffey, J. On aligning summative and

formative functions in the design of large-scale assessment systems. Paper submitted for

publication.

Shavelson, R.J., & Young, D. (2000). Embedding assessments in the FAST curriculum: On beginning the romance among curriculum, teaching and assessment. Proposal submitted to the Elementary, Secondary and Informal Education Division at the National Science Foundation.

Shepard, L. (2003). Reconsidering large-scale assessment to heighten its relevance to

learning. In J.M. Atkin & J.E. Coffey (Eds.), Everyday assessment in the science classroom

(pp. 121–146). Washington, DC: National Science Teachers Association Press.


van Zee, E.H., Iwasyk, M., Kurose, A., Simpson, D., & Wild, J. (2001). Student and teacher

questioning during conversations about science. Journal of Research in Science Teaching, 38,

159–190.

Yin, Y. (2004). Formative assessment influence on students’ science learning and motivation.

Proposal to the American Educational Research Association.

Zellenmayer, M. (1989). The study of teachers’ written feedback to students’ writing:

Changes in theoretical considerations and the expansion of research contexts. Instructional

Science, 18, 145–165.
