Annual Report to the Teacher Education Accreditation Council

New York University

First Post-Continuing-Accreditation Year

June 2013

FINAL DRAFT June 10, 2013

Department of Teaching and Learning

The Steinhardt School of Culture, Education, and Human Development

New York University

CONTENTS

INTRODUCTION

EVIDENCE BASE

PROGRAM OPTIONS

UPDATED PROGRAM OUTCOMES

DRSTOS-R

New York State Teacher Certification Exams (NYSTCE)

Student Teacher End-of-Term Feedback Surveys (ETFQ)

Educational Beliefs Multicultural Awareness Scale (EBMAS)

Grade Point Averages

Program Exit and Follow-Up Surveys

Graduate Employment and Retention

Appendix A. NYCDOE Teacher Education Program Report: NYU

Appendix B. EBMAS report

ATTACHMENT (APPENDIX E)

Tables

Table 1. Program options, completers, and enrollments

Table 2. Percentage of late-placement student teachers meeting standards on the Domain Referenced Student Teacher Observation Scale-Revised (DRSTOS-R) by academic year

Table 3. Summary of performance on DRSTOS-R Total Scores for student teachers in their last placements by program certification areas, fall 2010 – spring 2012

Table 4. Mean scaled scores, effect sizes, and passing rates for teacher-education graduates on the NYSTCE exams: Classes of 2011 & 2012

Table 5. Mean scores on the ETFQ Claim Scales for teacher-education students in last student teaching placements (Classes of 2011 and 2012)

Table 6. Mean EBMAS scale scores by degree and year compared to the program standard of 4.50

Table 7. Mean GPAs of NYU BS & MA teacher education graduates by claims (Class of 2012)

Table 8. Numbers and percents of Steinhardt teacher-education program completers who reported on the Program Exit Survey that their programs prepared them very or moderately well to begin teaching: Classes of '11 & '12

Table 9. Numbers and percents of Steinhardt teacher-education program completers who reported on the One-Year Follow-Up Survey that their programs prepared them very or moderately well to begin teaching: Classes of '11 & '12

Table 10. Comparison of the demographics of NYC schools in which NYU graduates first taught and all NYC schools disaggregated by school type (Classes of 2006 – '11)

Table 11. Retention status as of Sept. 2012 of Steinhardt graduates who began teaching in NYC public schools within one year of graduation (Classes of 2006 - 2011)

Figure 1. Mean scaled scores for NYSTCE Content Specialty Tests (BS graduates 2011 – 12)

Figure 2. Mean scaled scores for NYSTCE Content Specialty Tests (MA graduates 2011 – 12)

INTRODUCTION

Having been granted continuing accreditation of its teacher education program on June 11, 2012, NYU's Steinhardt School of Culture, Education, and Human Development submits this first annual report to the Teacher Education Accreditation Council (TEAC). This report focuses on our self-study activities for the two academic years 2010 – 11 and 2011 – 12. Prepared according to the specifications on TEAC's web site, the report includes (1) an update of Appendix E, the evidence that the program's self-study relies upon, (2) an update of the Table of Program Options, including student enrollment, graduation numbers, and descriptions of three program options developed since the Brief, and (3) updates of the data tables in the Results Section of the Inquiry Brief that support the program's claims. In addition, the report describes changes in the measures that are being used in the ongoing self-study of the effectiveness of Steinhardt's teacher education programs and the most recent results from the analyses of data.

Analyses of the new data indicate that Steinhardt's teacher education program continues to meet the claims for the development of competent, caring, and confident teachers who are committed to working in inner-city schools. Using data to inform program planning and improvement by faculty and administrators is a cultivated tradition at Steinhardt. Toward that end, the findings of this report have been shared and discussed at several faculty venues. First, the findings disaggregated by Teacher Education Program Areas were presented to the faculty of Teaching and Learning at the April 10, 2013 department meeting. Second, detailed results were delivered in a PowerPoint presentation at a meeting of the Teaching and Learning faculty on April 15th; the presentation was then emailed to all Teacher Education Faculty for follow-up discussions at individual program area meetings. Last, the data were also reviewed at the April 17th meeting of the Teacher Education Working Group (TEWG), which discussed areas of concern that will be addressed in the fall. TEWG will also plan an “internal audit” during the 2013-2014 academic year.

THE EVIDENCE BASE

The updated Appendix E (see attachment) details the evidence that NYU continues to collect, analyze, and report to faculty to assess the effectiveness of its teacher education program and inform continuous program improvement. The core of the evidence base remains in place, with some modifications and additions to measures and methods that are being driven by the institution of new teacher evaluation and certification systems by the New York State Department of Education (NYSED), new investments in assessment and accountability at Steinhardt, and the results of ongoing research on the measurement of teacher and program effectiveness.

First, the new NYSED systems are upgrading the data that will be available as evidence in several ways. Beginning in 2014, aspiring teachers will have to pass the Teacher Performance Assessment (edTPA) in order to obtain initial certification. The data from the edTPA will provide a rich measure of our students' practice-based skills benchmarked against a large national database. We plan on using the edTPA to supplement the home-grown DRSTOS-R, which has become institutionalized at Steinhardt as a valid, reliable, and useful measure of developing pedagogical proficiency. The new teacher evaluation system will yield effectiveness ratings for graduates teaching in New York State public schools, part of which will be based on a new Growth Percentile Measure (GPM) that uses the standardized test performance of the pupils they teach. We are negotiating a process to obtain the effectiveness ratings and GPM for our graduates in the same way that we obtained Value-Added Modeling data from the New York City Department of Education for the most recent Brief.

Second, building on the foundation provided by Steinhardt's Center for Research on Teaching and Learning (CRTL) in the assessment and evaluation of our teacher education program, Steinhardt is launching a new Center for Research on Higher Education Outcomes (CRHEO) this summer. CRHEO will continue to maintain the extant CRTL evidence base while exploring and developing new methods and measures to assess the effectiveness of clinically-based training programs. As part of the efforts to expand the evidence base with authentic performance-based assessments, Steinhardt is conducting due diligence on electronic portfolio systems for an anticipated pilot in the near future.

Third, this year for the first time, the New York City Department of Education (NYCDOE) shared data that it has collected on NYU graduates teaching in the NYC public schools. One caveat is that the NYCDOE report covers all graduates of NYU, including those who were not educated at the Steinhardt School. Nevertheless, these data represent an independent examination and confirm our conclusions that we are preparing high-quality teachers. These data can be found in Appendix A.

Last, Steinhardt continues to conduct research on the validity and reliability of its measures for assessing teacher and program effectiveness. CRTL has been able to mine the large and rich database that it has built to study the psychometric properties of its measures, resulting in improvements in the quality and usefulness of the data. An example is the recently completed study of CRTL's Educational Beliefs and Multicultural Attitudes Scale (EBMAS), which found new evidence supporting its validity and reliability. A copy of the article reporting this study can be found in Appendix B.

PROGRAM OPTIONS

Table 1 displays the list of program options, including for each the number of completers for the Class of 2012, which includes graduates in September 2011 and January and May 2012, and the number of students enrolled in fall 2012. Three new program options have been added since the Brief: Clinically-Based English Education (CREE), Clinically-Rich Integrated Science (CRISP), and Teaching Spanish as a Foreign Language and Teaching English as a Second Language, a joint program with the Graduate School of Arts and Sciences (GSAS). These new program options are described below.

CREE is a residency-based program designed to provide intensive fieldwork, combined with campus and online work, to prepare highly qualified teachers of English Language Arts. The program has been developed in partnership with the Great Oaks Foundation, which hosts the residencies at its charter schools (with plans to expand to other schools). Great Oaks is a not-for-profit educational foundation whose mission is to advance quality public education for poor children and families. It currently operates a charter school in Newark, NJ. This program leads to an Advanced Certificate, for initial certification in Teaching English, as well as an MA, for professional certification in Teaching English.

CRISP is a residency-based MA leading to initial/professional teacher certification. The program is designed to root teacher education deeply in the daily life of schools struggling to teach students challenged by poverty and special needs, while at the same time connecting both residents and their school-based mentors to the best practices of science and of science education at NYU and throughout the city. Its power to do these things is not just a matter of design, however. It derives too from the substantial history of collaboration between NYU and its partner schools. The CRISP design has four major and integrated components: (1) an intensive clinical experience mentored by both school and university faculty in a paid teaching residency within one of three high-need NYU partner schools, referred to here as host schools; (2) rigorous coursework taken in the public schools, drawing on the materials and expertise of the host schools and their immediate neighborhood (e.g., social service agencies, non-formal science centers, and other nearby schools); (3) rigorous coursework that draws on the learning resources of a well-networked research university (e.g., science departments, ties to the larger science community within and beyond New York, and research in the learning and health sciences); and (4) a performance assessment system based on the prospective New York State Teacher Education Standards and drawing on the work in progress of the Teacher Performance Assessment Consortium. Across the four components is an overarching emphasis on the use of technology to enhance agency, mentoring, and validity (the development of skills and knowledge that lead to pupil learning) in teacher preparation.

The M.A. degree in Teaching Spanish as a Foreign Language and Teaching English as a Second Language (TESOL) provides students with top-rate professional training in three areas: 1) mastery of the Spanish language; 2) teaching Spanish as a foreign language; and 3) teaching English as a Second Language (ESL). The M.A. leads to dual teacher certification in Teaching a Foreign Language (Grades 7-12) and TESOL (All Grades). The program entails two years of study. Year One takes place in Madrid, Spain, where students study Spanish language and participate in a teaching assistantship as English teachers in Spanish schools. Year Two takes place in New York City, where students take course work in language education and TESOL, and complete student teaching placements in New York City Schools. This Master's degree program is a joint offering of The Steinhardt School's Department of Teaching and Learning, the Graduate School of Arts and Sciences' Department of Spanish and Portuguese, and NYU in Madrid. As such, students gain the advantages available from the expertise of faculty in both Schools and both locations.

Table 1. Program options, completers, and enrollments

Option Name   Level (UG, Grad)   N completers (Class of 2012)   N enrolled (Fall 2012)

Teaching Educational Theatre, All Grades UG, grad 26 56

Teaching Music, All Grades UG, grad 29 104

Teaching Dance, All Grades Grad 17 26

Teaching Art, All Grades Grad 25 21

Childhood Education UG, grad 6 15

Early Childhood Education UG, grad 1 10

Teaching English, 7-12 UG, grad 32 74

Teaching a Foreign Language 7-12 (Chinese, French, German, Hebrew, Italian, Japanese, Latin, Russian, or Spanish) UG, grad 14 32

Science Education (Teaching Biology, Chemistry, Physics & Earth Science, 7-12) * UG, grad 6 22

Teaching Mathematics, 7-12 * UG, grad 47 52

Teaching Social Science, 7-12 UG, grad 22 41

Bilingual Education * Grad 0 0

Literacy (B-6, 5-12) Grad 17 20

Teachers of English to Speakers of Other Languages * Grad 17 25

Special Education: Childhood * Grad 4 11

Special Education: Early Childhood * Grad 2 3

Dual Certification: Educational Theatre, All Grades & English Education, 7-12 Grad 13 15

Dual Certification: Educational Theatre , All Grades & Social Studies, 7-12 Grad 7 5

Dual Certification: Childhood Education/Childhood Special Education * UG, grad 84 182

Dual Certification: Early Childhood Education/Early Childhood Special Education * UG, grad 47 102

Teaching French as a Foreign Language/TESOL Joint Degree GSAS * Grad 9 14

Teaching Spanish as a Foreign Language/TESOL Joint Degree GSAS (New) * Grad 9

Clinically Based English Education (New) Grad 7

Clinically Rich Integrated Science (New) Grad 18

Total N 449 906

*High-need areas

UPDATED PROGRAM OUTCOMES

DRSTOS-R

Table 2 presents DRSTOS-R ratings for students in their final student teaching placement for the Classes of 2011 and 2012, for a total of 318 BS students and 430 MA students. The total results across the two classes parallel those in the Brief for both BS and MA student teachers. As in the Brief, the percentages of MA students with a mean of at least 3.0 continue to meet or exceed the program standard of 80% for all four domains and the Total Scale. The BS students continued to fall below the program standard for all domains except Professional Responsibilities. However, the BS students did show large gains of about 10 percentage points for the three scales and overall, scoring at or above 70% for all domains for the first time since the scale was first administered in 2005. Disaggregated results by program options are displayed in Table 3. For BS students, the program standard was met for only two groups, science and social studies, with only three students assessed in science. For MA students, the program standard was met for 10 of the 12 program options, with social studies and dance showing the highest percentages scoring means of at least 3.0.

Table 2. Percentage of Late-Placement Student Teachers Meeting Standards on the Domain Referenced Student Teacher Observation Scale Revised (DRSTOS-R) by Academic Year (See the notes following the table.)

Claims                  Scale Domain                  Number of Items   Statistic                          2010 - 2011   2011 - 2012   Total**

BS Students
1                       Planning & Preparation        6                 Total (N)                          158           160           318
                                                                        % Meeting Standards (Mean>=3.0)    74.1%         68.1%         71.1%
3,4                     Classroom Environment         7                 % Meeting Standards (Mean>=3.0)    75.3%         69.4%         72.3%
2                       Instruction                   7*                % Meeting Standards (Mean>=3.0)    76.6%         71.3%         73.9%
CCT Learning to Learn   Professional Responsibilities 3                 % Meeting Standards (Mean>=3.0)    81.6%         83.1%         82.4%
3                       Total Score                   21*               Total (N)                          158           160           318
                                                                        % Meeting Standards (Mean>=3.0)    72.2%         66.9%         69.5%

MA Students
1                       Planning & Preparation        6                 Total (N)                          210           220           430
                                                                        % Meeting Standards (Mean>=3.0)    80.0%         76.8%         78.4%
3,4                     Classroom Environment         7                 % Meeting Standards (Mean>=3.0)    80.5%         79.5%         80.0%
2                       Instruction                   7*                % Meeting Standards (Mean>=3.0)    80.0%         81.8%         80.9%
CCT Learning to Learn   Professional Responsibilities 3                 % Meeting Standards (Mean>=3.0)    86.7%         90.0%         88.4%
3                       Total Score                   21*               Total (N)                          210           220           430
                                                                        % Meeting Standards (Mean>=3.0)    78.1%         78.2%         78.1%

Notes. Scale is (1) Not Yet Proficient (2) Partially Proficient (3) Entry Level Proficient (4) Proficient. The standard for proficiency is 3.

* Two additional items were added to “Instruction” in spring 2012, increasing the number of items from 5 to 7.

** Values in bold font meet the program standard of 80% >=3; values in bold italics fall within the 95% confidence interval around the standard, which means they are not significantly lower than the standard, p<.05.
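The decision rule in the note above (the standard is met, or the observed percentage is not significantly lower than the standard) can be illustrated with a minimal sketch. The report does not state how the 95% confidence interval was computed; the normal approximation to the binomial, the critical value of 1.96, and the helper name `classify` are assumptions made here for illustration only.

```python
import math

STANDARD = 0.80  # program standard from the table note: 80% with a mean >= 3.0
Z_95 = 1.96      # assumed two-sided 95% critical value (normal approximation)


def classify(observed: float, n: int, standard: float = STANDARD) -> str:
    """Label an observed proportion relative to the program standard.

    Hypothetical helper: the interval is built around the standard with the
    normal approximation, which may differ from the method used in the report.
    """
    half_width = Z_95 * math.sqrt(standard * (1 - standard) / n)
    if observed >= standard:
        return "meets standard"
    if observed >= standard - half_width:
        return "not significantly lower"
    return "below standard"


# Total Score rows of Table 2, both classes combined
print("BS:", classify(0.695, 318))
print("MA:", classify(0.781, 430))
```

Under these assumptions the interval narrows as N grows, so the same observed percentage can fall inside the interval for a small program and outside it for a large one.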

Table 3. Summary of performance on DRSTOS-R Total Scores for student teachers in their last placements by program certification areas, fall 2010 – spring 2012

Program * N Assessed % >=3** M SD

Undergraduate

Dual Early Childhood 45 68.9% 3.23 0.54

Dual Childhood 169 74.0% 3.30 0.54

Ed. Theatre 18 66.7% 3.22 0.43

English 27 59.3% 3.12 0.44

Math 5 20.0% 2.80 0.18

MMS 5 60.0% 3.34 0.49

Music 26 57.7% 3.11 0.40

Science 3 100.0% 3.98 0.03

Social Studies 19 78.9% 3.34 0.39

Graduate

Early Childhood/Dual Early Childhood 48 83.3% 3.34 0.46

Childhood/ Dual Childhood 86 73.3% 3.27 0.45

Science 14 78.6% 3.56 0.29

English 42 76.2% 3.28 0.49

Social Studies 27 92.6% 3.53 0.36

Math 17 70.6% 3.09 0.34

MMS 66 81.8% 3.51 0.51

Educational Theatre 47 70.2% 3.32 0.41

Art 39 74.4% 3.19 0.31

Dance 22 100.0% 3.58 0.20

Music 21 66.7% 3.06 0.51

** Values in bold font meet the program standard of 80% >=3; values in bold italics fall within the 95% confidence interval around the standard, which means they are not significantly lower than the standard, p<.05.

New York State Teacher Certification Exams (NYSTCE)

Table 4 displays the results of the performance of graduates on the NYSTCE exams in 2011 and 2012. Consistent with the findings reported in the Brief, graduates continue to show strong performance on the three sets of exams, exceeding the dual program standards of 90% passing and an effect size of at least 0.80, indicating that the mean scale score exceeded passing to a large and educationally meaningful degree. Figures 1 and 2 display the mean scores on the major Content Specialty Tests (CSTs) for BS and MA graduates, respectively, in 2011 and 2012 combined. As can be seen, the mean scores of both BS and MA Steinhardt students exceeded the passing score of 220 for all CSTs, although there were large differences among the specialty areas, with mean scores in math exceeding all other areas for both degree options.

Table 4. Mean scaled scores, effect sizes, and passing rates for teacher-education graduates on the NYSTCE exams: Classes of 2011 & 2012

Exam/Claim   Degree   Class   N   Scaled scores: Mean   SD   Effect size   % Passing

LAST/ cross-cutting theme 1, Learning to Learn

BS

2011 73 271.6 15.4 3.35 100.0%

2012 113 264.2 21.2 2.09 92.9%

Total 186 267.1 19.4 2.43 95.7%

MA

2011 222 268.4 23.1 2.10 95.5%

2012 187 270.4 18.3 2.76 98.9%

Total 409 269.3 21.0 2.34 97.1%

ATS-W/Claim 2, Pedagogical Knowledge

BS

2011 79 267.7 17.0 2.80 97.5%

2012 108 269.0 20.0 2.45 94.4%

Total 186 267.1 19.4 2.43 95.7%

MA

2011 228 269.0 15.7 3.11 98.7%

2012 197 270.4 13.9 3.64 99.5%

Total 425 269.7 14.9 3.33 99.1%

CST/Claim 1, Content Knowledge

BS

2011 126 255.0 21.2 1.65 94.4%

2012 163 249.8 23.0 1.30 91.4%

Total 289 252.1 22.4 1.43 92.7%

MA

2011 381 254.2 25.9 1.32 90.3%

2012 326 254.1 23.0 1.48 92.0%

Total 707 254.1 24.6 1.39 91.1%

* ES = Effect Size = SDs Above Passing = (MSS - 220)/SD; the program standard is an ES >= .80, large and meaningful

** Passing score = 220 on a scale of 100 – 300. The program standard is 90% passing.

*** If a student has multiple tests, data are based on the most recent exam
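The effect-size formula in the first footnote, ES = (MSS - 220)/SD, and the dual program standard can be checked against any row of Table 4. The sketch below applies them to the BS Class of 2011 LAST row; the function names are illustrative and are not taken from the report.

```python
PASSING_SCORE = 220.0      # NYSTCE passing score on the 100 – 300 scale
ES_STANDARD = 0.80         # program standard for effect size
PASS_RATE_STANDARD = 0.90  # program standard for the passing rate


def effect_size(mean_scaled_score: float, sd: float) -> float:
    """Standard deviations above passing: (MSS - 220) / SD."""
    return (mean_scaled_score - PASSING_SCORE) / sd


def meets_dual_standard(mean_scaled_score: float, sd: float, pass_rate: float) -> bool:
    """True when both the effect-size and passing-rate standards are met."""
    return (effect_size(mean_scaled_score, sd) >= ES_STANDARD
            and pass_rate >= PASS_RATE_STANDARD)


# LAST, BS, Class of 2011: mean 271.6, SD 15.4, 100.0% passing
es = effect_size(271.6, 15.4)
print(f"Effect size: {es:.2f}")
print("Meets dual standard:", meets_dual_standard(271.6, 15.4, 1.000))
```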

Figure 1. Mean scaled scores for NYSTCE Content Specialty Tests (BS graduates 2011 – 12)

Note: Ns are as follows: Math = 15, El. Ed. = 94, Social Studies = 15, Studs. With Disabilities = 93, Music = 23, English = 23

Figure 2. Mean scaled scores for NYSTCE Content Specialty Tests (MA graduates 2011 – 12)

Note: Ns are as follows: Math = 56, Literacy = 19, For. Lang. = 67, El. Ed. = 131, ESOL = 72, Studs. With Disabilities = 120, English = 53, Science = 20, Theater = 49, Visual Arts = 39, Social Studies = 24, Dance = 32, Music = 20

Student Teacher End-of-Term Feedback Surveys (ETFQ)

Table 5 displays the results of the assessment of Claims 1, 2, and 3 for the Classes of 2011 and 2012 using ETFQ data. The total mean scores for each of the three claim scales met the criterion of 4.0 (nominally equivalent to a rating of “Well”) for both BS and MA program finishers. For MA students, the means exceeded the program standard on all three claim scales, while for BS students, the means exceeded the standard for Claims 2 and 3 and the mean for Claim 1 was not statistically significantly different from the standard. These results are consistent with the findings in the Brief and indicate that program completers continue to meet program standards on these two measures in the two years following the accreditation study.

Table 5. Mean scores on the ETFQ Claim Scales for teacher-education students in last student teaching placements (Classes of 2011 and 2012)

Claim Scale                       Statistic   2011 BS   2011 MA   2012 BS   2012 MA   Total BS   Total MA

Claim 1. Content Knowledge        Mean        3.91      4.05      3.93      4.10      3.92       4.07
                                  N           206       343       132       191       338        534
                                  SD          0.86      0.90      0.96      0.96      0.90       0.92

Claim 2. Pedagogical Knowledge    Mean        4.00      4.12      4.11      4.22      4.04       4.16
                                  N           206       343       132       191       338        534
                                  SD          0.92      0.84      0.81      0.81      0.88       0.83

Claim 3. Clinical Knowledge       Mean        4.08      4.18      4.08      4.20      4.08       4.19
                                  N           206       343       132       191       338        534
                                  SD          0.82      0.82      0.86      0.84      0.83       0.83

Notes: Claim 1 scale items are Items 9 and 18; Claim 2 scale items are Items 7 and 15; and Claim 3 scale items are Items 8, 11, 16, and 19. Items are measured on a 5-point Likert scale with scale values of (1) “Very Poorly”, (2) “Poorly”, (3) “Average”, (4) “Well”, and (5) “Very Well”.

* Total means in bold meet the program standard of 4.0; means in bold italics are not significantly different from the program standard of 4.0.
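Whether a mean is "not significantly different from the program standard" can be checked from the summary statistics in the table alone. The sketch below uses a one-sample t statistic with a large-sample cutoff of 1.96; the report does not name the test it used, so both the test and the cutoff are assumptions, and the function name is illustrative.

```python
import math

CRITICAL_VALUE = 1.96  # assumed two-sided 5% cutoff; reasonable only for large N


def one_sample_t(mean: float, sd: float, n: int, standard: float) -> float:
    """t statistic for H0: population mean equals the program standard."""
    standard_error = sd / math.sqrt(n)
    return (mean - standard) / standard_error


# Claim 1, BS students, both classes combined: mean 3.92, SD 0.90, N 338
t_stat = one_sample_t(3.92, 0.90, 338, 4.0)
differs = abs(t_stat) > CRITICAL_VALUE
print(f"t = {t_stat:.2f}; significantly different from 4.0: {differs}")
```

Under these assumptions the BS Claim 1 mean of 3.92 is not distinguishable from the standard of 4.0, which matches the reading given in the text.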

Educational Beliefs Multicultural Awareness Scale (EBMAS)

Table 6 displays the comparison of mean EBMAS scale scores against the program standard of 4.5 for BS and MA program finishers in the Classes of 2011 and 2012. Continued research on the EBMAS found a factor structure that was different from the one that emerged in earlier analyses and led to a change in the scoring to yield five scale scores associated with three claims rather than the four reported in the Brief. Since the new scoring structure was based on research using a much larger sample size and was better aligned with the theory underlying the construction of EBMAS, the five-score structure will be used in subsequent assessments. As shown in the table, two scales, Personal Teacher Efficacy 1 (PTE 1) and Personal Teacher Efficacy 2 (PTE 2), are associated with Claim 3; two, General Teacher Efficacy (GTE) and Social Justice (SJ), with Claim 4; and one, Multicultural Awareness (MA), with Cross Cutting Theme 2. For both BS and MA students, all observed means either met or were not statistically significantly different from the program standard of 4.50, thereby supporting the claims. The highest mean scores were for the Multicultural Awareness and SJ scales and the lowest for PTE 1 and PTE 2, especially for BS students. Overall, the results were better than those in the Brief, suggesting progress in this measure during the two years since the accreditation study (see Appendix B).

Table 6. Mean EBMAS scale scores by degree and year compared to the program standard of 4.50

Scale (Claim)** Year BS MA

N Mean SD M - 4.50 * N Mean SD M - 4.50 *

PTE 1. Personal Efficacy: Student Problem Solving (Claim 3)

2010 - 11 54 4.70 0.66 0.20 114 4.40 0.80 -0.10

2011 - 12 68 4.55 0.88 0.05 120 4.51 0.67 0.01

Total 122 4.61 0.78 0.11 234 4.46 0.73 -0.04

PTE 2. Personal Efficacy: Student Success (Claim 3)

2010 - 11 54 4.25 0.76 -0.25 114 4.36 0.70 -0.14

2011 - 12 68 4.24 0.67 -0.26 119 4.49 0.64 -0.01

Total 122 4.24 0.71 -0.26 233 4.42 0.67 -0.08

GTE. General Teacher Efficacy (Claim 4)

2010 - 11 54 4.94 0.90 0.44 114 4.77 0.95 0.27

2011 - 12 68 5.02 0.83 0.52 120 4.96 0.88 0.46

Total 122 4.99 0.86 0.49 234 4.87 0.91 0.37

MA. Multicultural Awareness (CCT 2)

2010 - 11 54 5.54 0.57 1.04 114 5.35 0.68 0.85

2011 - 12 68 5.50 0.66 1.00 120 5.50 0.49 1.00

Total 122 5.52 0.62 1.02 234 5.42 0.58 0.92

SJ. Social Justice (Claim 4)

2010 - 11 54 5.37 0.51 0.87 114 5.28 0.70 0.78

2011 - 12 68 5.34 0.61 0.84 119 5.39 0.52 0.89

Total 122 5.36 0.56 0.86 233 5.34 0.61 0.84

* Values in bold font indicate the program standard of 4.5 has been met or exceeded; values in bold italics are not significantly different from the program standard.

** TEAC Claims: Claim 3, Clinical Competence; Claim 4, Caring Professional; CCT 2, Multicultural Perspective

Responses are measured on a 6-point scale of agreement as follows: (1) Strongly Disagree (2) Moderately Disagree (3) Slightly Disagree (4) Slightly Agree (5) Moderately Agree (6) Strongly Agree.

Grade Point Averages

Table 7 presents the mean values for four types of GPAs associated with three claims and Cross Cutting Theme 1 (CCT1), Learning to Learn, for BS and MA graduates in the class of 2012. As can be seen in the table, the program standard of 3.0 was exceeded for all three claims by both BS and MA students; for CCT1, MA students exceeded the standard while the mean for BS students did not differ significantly from the standard. The findings for the claims are consistent with those reported in the Brief and better than those in the Brief for CCT1.

Table 7. Mean GPAs of NYU BS & MA teacher education graduates by claims (Class of 2012)

Claim                   GPA*   Statistic   BS**   MA**

1                       CK     Mean        3.04   3.46
                               SD          0.69   0.50
                               N           137    61

2                       PK     Mean        3.56   3.84
                               SD          0.32   0.40
                               N           137    325

3                       CS     Mean        3.85   3.87
                               SD          0.36   0.32
                               N           120    283

Cross-cutting theme 1   CCT1   Mean        2.92   3.46
                               SD          0.90   0.50
                               N           105    61

* Types of GPA: CK=Content Knowledge; PK=Pedagogical Knowledge; CS=Clinical Skill; CCT1=Cross Cutting Theme, Learning to Learn

** Means in bold font meet or exceed the program standard of 3.0 on a 4-point scale; means in bold italics do not differ significantly from the standard of 3.0.

Program Exit and Follow-Up Surveys

Tables 8 and 9 display the results of two surveys administered to BS and MA program completers and graduates, respectively, in the classes of 2011 and 2012. The Program Exit Survey (Table 8) is administered in May to program completers to elicit their perceptions of the extent to which the program prepared them to begin teaching. The One-Year Follow-Up Survey (Table 9) is administered to the same samples eight months after graduation to assess their preparation for teaching after most had entered the teaching profession. In both tables, the items are clustered by the claims and cross-cutting themes they address. As can be seen in Table 8, at program exit BS students met the program standard of 80% feeling “Well” or “Very Well” prepared to begin teaching with respect to Content Knowledge, Clinical Skill, and Cross-Cutting Theme 2, Multicultural Perspective. They met the standard for four of the six items related to Pedagogical Knowledge, falling short on “addressing the needs of students with limited English proficiency” and “working with parents”, and for two of the three items related to Caring Professionals. Overall, the results for MA students were not as strong as those for the BS students. They met the standard for all items related to Content Knowledge, Clinical Skill, and Cross-Cutting Theme 2, but met the standard for only three of the six Pedagogical Skill items and none of the Caring Professional items. Neither BS nor MA students met the standard for Cross-Cutting Theme 3, Knowledge of Technology.

Table 8. Numbers and percents of Steinhardt teacher-education program completers who reported on the Program Exit Survey that their programs prepared them very or moderately well to begin teaching: Classes of '11 & '12

Claim How well did your teacher

education program prepare

you to:

Responded Very Well (4) or Moderately Well (3)

BS (N = 90) MA (N = 141)

VW (4) MW (3) (4+3) VW (4) MW (3) (4+3)

1. Content

knowledge

Have a mastery of your

subject area

N 35 39 74 55 48 103

% 38.9% 43.3% 82.2% 39.6% 34.5% 74.1%

1. Content

knowledge

Implement state/district

curriculum & standards

N 45 36 81 64 44 108

% 50.0% 40.0% 90.0% 46.0% 31.7% 77.7%

2.Pedagogical

knowledge Understand how students learn

N 47 38 85 75 51 126

% 52.2% 38.7% 90.9% 53.6% 36.4% 90.0%

2.Pedagogical

knowledge

Use different pedagogical

approaches

N 43 35 78 76 46 122

% 49.4% 40.2% 89.6% 54.3% 32.9% 87.2%

2.Pedagogical

knowledge

Use student performance

assessment techniques

N 47 32 79 66 43 109

% 53.4% 36.4% 89.8% 47.5% 30.9% 78.4%

2.Pedagogical

knowledge

Address needs of students

with disabilities

N 32 43 75 47 47 94

% 35.6% 47.8% 83.4% 34.1% 34.1% 68.2%

2.Pedagogical

knowledge


Address needs of students

with limited English

proficiency

N 15 36 51 31 33 64

% 16.7% 40.0% 56.7% 22.5% 23.9% 46.4%

2.Pedagogical

knowledge Work with parents

N 20 36 56 20 41 61

% 22.5% 40.4% 62.9% 14.2% 29.1% 43.3%

3.Clinical

skill

Maintain order & discipline in

the classroom

N 44 36 80 56 62 118

% 48.9% 40.0% 88.9% 40.3% 44.6% 84.9%

3.Clinical

skill

Impact my students' ability to

learn

N 48 33 81 62 66 128

% 53.3% 36.7% 90.0% 44.3% 47.1% 91.4%

4. Caring

Professionals

Work collaboratively with

teachers, administrators and

other school personnel

N 36 32 68 52 50 102

% 40.4% 36.0% 76.4% 36.9% 35.5% 72.4%

4. Caring

Professionals

Identify & use resources

within the community where

you teach

N 30 40 70 41 53 94

% 33.3% 44.4% 77.7% 29.3% 37.9% 67.2%

4. Caring

Professionals

Participate as a stakeholder in

the community where you

teach

N 15 41 56 34 46 80

% 16.7% 45.6% 62.3% 24.3% 32.9% 57.2%

Cross-cutting

theme 2


Address needs of students

from diverse cultures

N 44 34 78 66 41 107

% 49.4% 38.2% 87.6% 47.5% 29.5% 77.0%

Cross-cutting

theme 3

Integrate technology into

teaching

N 25 26 51 48 38 86

% 27.8% 28.9% 56.7% 34.0% 27.0% 61.0%

* Total percents in bold meet or exceed the program criterion of 80%; those in bold italics have the program criterion

within the 95% confidence interval for the observed value.
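The footnote's second rule, that the 80% criterion lies within the 95% confidence interval for the observed value, can be sketched as follows. The report does not say which interval was used; the Wald (normal-approximation) interval and the function names below are assumptions for illustration, and the denominator of 90 is taken from the BS column header, although some rows appear to have slightly fewer respondents.

```python
import math

CRITERION = 0.80  # program criterion from the table footnote
Z_95 = 1.96       # assumed two-sided 95% critical value


def wald_interval(count: int, n: int) -> tuple[float, float]:
    """Assumed 95% Wald interval around the observed proportion count / n."""
    p = count / n
    half_width = Z_95 * math.sqrt(p * (1 - p) / n)
    return p - half_width, p + half_width


def criterion_within_interval(count: int, n: int) -> bool:
    """True when the criterion falls inside the interval for the observed value."""
    low, high = wald_interval(count, n)
    return low <= CRITERION <= high


# BS completers (N = 90): community resources item, then technology item
print(criterion_within_interval(70, 90))
print(criterion_within_interval(51, 90))
```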

Table 9. Numbers and percents of Steinhardt teacher-education program completers who reported on the One-Year Follow-Up Survey that their programs prepared them very or moderately well to begin teaching: Classes of '11 & '12

Claim How well did your teacher

education program prepare

you to:

Responded Very Well (4) or Moderately Well (3)

BS (N = 63) MA (N = 120)

VW (4) MW (3) (4+3) VW (4) MW (3) (4+3)

1.Content

knowledge

Have a mastery of your subject

area

N 28 25 53 37 55 92

% 44.4% 39.7% 84.1% 30.8% 45.8% 76.6%

1.Content

knowledge

Implement state/district

curriculum & standards

N 27 26 53 27 44 71

% 42.9% 41.3% 84.2% 22.5% 36.7% 59.2%

2.Pedagogical

knowledge Understand how students learn

N 39 15 54 45 48 93

% 61.9% 23.8% 85.7% 37.5% 40.0% 77.5%

2.Pedagogical

knowledge

Use different pedagogical

approaches

N 33 24 57 44 52 96

% 52.4% 38.1% 90.5% 36.7% 43.3% 80.0%

2.Pedagogical

knowledge

Use student performance

assessment techniques

N 30 26 56 34 53 87

% 47.6% 41.3% 88.9% 28.3% 44.2% 72.5%

2.Pedagogical

knowledge

Address needs of students with

disabilities

N 31 16 47 32 35 67

% 49.2% 25.4% 74.6% 26.7% 29.2% 55.9%

2.Pedagogical

knowledge

Address needs of students with

limited English proficiency

N 19 14 33 23 31 54

% 30.2% 22.2% 52.4% 19.2% 25.8% 45.0%

2.Pedagogical

knowledge Work with parents

N 18 23 41 15 31 46

% 28.6% 36.5% 65.1% 12.5% 25.8% 38.3%

3.Clinical

skill

Maintain order & discipline in

the classroom

N 21 20 41 15 42 57

% 33.3% 31.7% 65.0% 12.5% 35.0% 47.5%

3.Clinical

skill

Impact my students' ability to

learn

N 37 19 56 40 46 86

% 58.7% 30.2% 88.9% 33.3% 38.3% 71.6%

4. Caring

Professionals

Work collaboratively with

teachers, administrators and

other school personnel

N 32 20 52 40 40 80

% 50.8% 31.7% 82.5% 33.3% 33.3% 66.6%

4. Caring

Professionals

Identify & use resources within

the community where you


teach

N 27 22 49 31 39 70

% 42.9% 34.9% 77.8% 25.8% 32.5% 58.3%

4. Caring

Professionals

Participate as a stakeholder in

the community where you


teach

N 22 21 43 18 43 61

% 34.9% 33.3% 68.2% 15.0% 35.8% 50.8%

Cross-cutting

theme 2

Address needs of students from

diverse cultures

N 35 19 54 41 46 87

% 55.6% 30.2% 85.8% 34.2% 38.3% 72.5%

Cross-cutting

theme 3

Integrate technology into

teaching

N 23 24 47 38 33 71

% 36.5% 38.1% 74.6% 31.7% 27.5% 59.2%

* Total percents in bold meet or exceed the program criterion of 80%; those in bold italics have the program criterion

within the 95% confidence interval for the observed value.


As can be seen in Table 9, the BS graduates' perceptions of their preparation for teaching one

year after graduation were similar to the ones they had at program exit, while the MA students

felt less prepared after graduation than they did at program exit. MA students met program

standards on only three of the 15 items on the Follow-Up Survey, compared to eight of 15 on the

Program Exit Survey. For BS graduates, there were two noteworthy differences in their

responses to the two surveys. First, whereas they fell below standard at graduation in

Technology, their perceptions were higher on the Follow-Up survey and met the standard.

Second, they met standards on one of the two Clinical Skill items on the Follow-Up survey

compared to two out of two at Program Exit.

The results of the two surveys for 2011 and 2012 are generally in line with those reported

in the Brief and suggest the need for continued work on improving the curriculum and

experiences of Steinhardt teacher education students in certain areas of teaching skills.
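The table footnotes distinguish percents that meet the 80% criterion from those whose 95% confidence interval contains it. The following is a minimal sketch of that classification, assuming a normal-approximation (Wald) interval; the report does not state which interval method was used, and the function names are illustrative only. The counts are the BS follow-up figures from Table 9 (N = 63).

```python
import math

def wald_ci(successes, n, z=1.96):
    """Normal-approximation 95% CI for a proportion (assumed method)."""
    p = successes / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)

def classify(successes, n, criterion=0.80):
    """Label an observed percent relative to the 80% program criterion."""
    p = successes / n
    low, high = wald_ci(successes, n)
    if p >= criterion:
        return "meets criterion"
    if low <= criterion <= high:
        return "criterion within 95% CI"
    return "below criterion"

# BS respondents to the One-Year Follow-Up Survey (Table 9, N = 63)
for label, count in [
    ("Mastery of subject area", 53),
    ("Integrate technology", 47),
    ("Limited English proficiency", 33),
]:
    print(f"{label}: {count / 63:.1%} -> {classify(count, 63)}")
```

With small groups such as the BS sample, the interval is wide, so a percent several points below 80% can still fall in the second category.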

Graduate Employment and Retention

Table 10 displays a comparison of the demographic characteristics of the NYC public

schools in which Steinhardt graduates from the classes of 2006 – 11 were employed and the

demographics of all NYC public schools at school level, elementary through high school. The

program standard is that the demographic characteristics for the schools of Steinhardt graduates

will be statistically similar to those of NYC public schools overall. As can be seen in the table,

the schools of Steinhardt graduates are highly diverse. Nevertheless, they tend to have

statistically significantly lower percentages of Black and Hispanic students and of students eligible for free lunch

than NYC public schools overall. On the other hand, the middle schools of Steinhardt graduates

had higher percentages of ELL students. The differences in percentages of minority and poor

students are largely attributable to the tendency of Steinhardt graduates to be employed in

schools in District 2 in Manhattan, a district in which NYU is situated and one with lower

percentages of poor and minority students than the city overall. These results are similar to those

reported in the Brief.

Table 11 displays the results of an analysis of retention data obtained from the NYCDOE

for Steinhardt graduates from the classes of 2006 – 11 who were employed in NYC public

schools. The program standard is 70% of graduates remaining employed or leaving after serving

at least three years in the NYC public schools, a standard that is better than the average for new

teachers in the NYC public schools. This standard uses a single criterion as opposed to the

multiple criteria, differentiated by year of graduation, that were used in the Brief. The change in

the standard is intended to simplify tracking progress on this indicator, thereby increasing the

reliability of inferences based on the data. As can be seen in Table 11, the standard was met for

both BS and MA graduates from all classes. Overall, as of September 2012, 80% of all

graduates from the classes of 2006 – 11 who taught in NYC public schools remained employed

or left after serving at least three years. These results are consistent with the results reported in

the Brief and are higher than the 60% overall three-year retention rate cited in a staff report from

the New York City Council.1

1 New York City Council (July 2009). A staff report of the New York City Council Investigation Division on teacher

attrition and retention. Retrieved on June 4, 2013 from

http://www.nyc.gov/html/records/pdf/govpub/1024teachersal.pdf



Table 10. Comparison of the demographics of NYC schools in which NYU graduates first taught

and all NYC schools disaggregated by school type (Classes of 2006 – '11)

School Type   Demographic   N Grads   Grads' Schools Mean   Grads' Schools SD   All NYC Schools Mean   All NYC Schools SD   Diff. in Means*

Elementary %ELL 968 14.7 12.3 16.9 13.1 -2.2

%Spec. Ed. 989 17.3 6.7 16.9 6.3 0.4

% Black & Hispanic 989 63.1 24.2 70.8 31.3 -7.7

% Free lunch 989 61.7 27.7 68.7 23 -7.0

N Enrolled 989 655.8 284.1 639.3 277.9 16.5

Middle %ELL 327 16.4 13.7 11.1 12.2 5.3

%Spec. Ed. 384 17.9 6.6 16.6 7.4 1.3

% Black & Hispanic 384 73.5 27.3 81 25.1 -7.5

% Free lunch 384 68.7 20.8 68.9 19.3 -0.2

N Enrolled 384 691.8 428.7 584.6 419.2 107.2

K-8 %ELL 14 10.6 9.9 11.6 11 -1.0

%Spec. Ed. 14 13.6 8.2 16.6 6.7 -3.0

% Black & Hispanic 14 65.4 18.7 78.3 27.4 -12.9

% Free lunch 14 56.0 34.8 67.7 21.9 -11.7

N Enrolled 14 502.6 129.5 684.6 290.8 -182.0

High School %ELL 416 10.4 16.9 12.6 18.5 -2.2

%Spec. Ed. 418 11.9 6.6 12.8 6.8 -0.9

% Black & Hispanic 418 72.9 11.9 82.3 22.1 -9.4

% Free lunch 418 55.9 21.5 61.4 19.9 -5.5

N Enrolled 418 846.7 913.4 898.3 1027.2 -51.6

Note 1: School demographic data were not available for all graduates who were working in NYC

public schools

* Differences in bold italics are statistically significant at p < .05. The program standard is that the

means for the percent of at-risk students in the schools of NYU graduates will be equal to or higher

than the means for all NYC public schools. The standard does not apply to enrollment.


Table 11. Retention status as of Sept. 2012 of Steinhardt graduates who began teaching

in NYC public schools within one year of graduation (Classes of 2006 – 2011)

Degree   Class   Statistic   Retention status*: Left before 3 years   Still employed   Left after 3 years   Total Hired

BS/BMUS

2006 N 6 18 10 34

% in class 17.6% 52.9% 29.4% 100.0%

2007 N 9 22 4 35

% in class 25.7% 62.9% 11.4% 100.0%

2008 N 7 28 1 36

% in class 19.4% 77.8% 2.8% 100.0%

2009 N 8 23 1 32

% in class 25.0% 71.9% 3.1% 100.0%

2010 N 7 23 0 30

% in class 23.3% 76.7% 0.0% 100.0%

2011 N 2 21 0 23

% in class 8.7% 91.3% 0.0% 100.0%

Total N 39 135 16 190

% in class 20.5% 71.1% 8.4% 100.0%

MA

2006 N 41 95 38 174

% in class 23.6% 54.6% 21.8% 100.0%

2007 N 39 139 26 204

% in class 19.1% 68.1% 12.7% 100.0%

2008 N 34 116 14 164

% in class 20.7% 70.7% 8.5% 100.0%

2009 N 31 119 4 154

% in class 20.1% 77.3% 2.6% 100.0%

2010 N 23 107 2 132

% in class 17.4% 81.1% 1.5% 100.0%

2011 N 14 92 0 106

% in class 13.2% 86.8% 0.0% 100.0%

Total N 182 668 84 934

% in class 19.5% 71.5% 9.0% 100.0%

* Program standard is a total of 70% still employed or leaving after 3 years of service in NYCDOE public schools. All classes have met the standard.
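The retention indicator can be recomputed directly from the Total rows of Table 11. The sketch below assumes the indicator is simply the share of hired graduates who are still employed or who left after at least three years; the function name is illustrative.

```python
def retention_rate(still_employed, left_after_3, total_hired):
    """Share of hired graduates counted as retained under the program standard."""
    return (still_employed + left_after_3) / total_hired

# Total rows from Table 11
bs = retention_rate(135, 16, 190)
ma = retention_rate(668, 84, 934)
overall = retention_rate(135 + 668, 16 + 84, 190 + 934)

STANDARD = 0.70
for label, rate in [("BS/BMUS", bs), ("MA", ma), ("Overall", overall)]:
    status = "met" if rate >= STANDARD else "not met"
    print(f"{label}: {rate:.1%} (standard {status})")
```

The overall figure agrees with the 80% reported in the text for the classes of 2006 – 11.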


APPENDIX A

APPENDIX B.

The Developing Teaching Dispositions of NYU Steinhardt’s

Teacher Education Students: An Analysis of Responses to the Educational

Beliefs and Multicultural Attitudes Scale (EBMAS)

Data Collected in Academic Years 2009-10 – 2011-12

March 21, 2013

Robert Tobias, Research Consultant and

Retired Director of the Center for Research on Teaching and Learning,

Steinhardt’s Department of Teaching and Learning


ABSTRACT

This report presents updated findings from the analysis of EBMAS, a component of NYU

Steinhardt’s student and program assessment system, for the three academic years

2009-10 thru 2011-12. During that time, EBMAS was administered to 1,450 undergraduate and

graduate students who were at the beginning, middle, or end of their pre-service teacher-

education programs. The report presents findings from continued research on EBMAS, results

from the use of the scale to assess TEAC program claims, and analyses of the differences in

scores for students grouped by demographic, experiential, and program characteristics. These

findings update the results reported for a smaller dataset in NYU’s 2011 TEAC Inquiry Brief for

re-accreditation.

The findings lead to the overall conclusion that EBMAS has been a valid and reliable tool for

assessing the developing teaching dispositions of NYU Steinhardt teacher education students

and, consequently, the data have important implications for the readiness to teach of the

graduates and the effectiveness of the program in preparing competent and caring educators.

NYU graduates generally have strong beliefs in the general efficacy of teaching to promote the

learning and positive behavior of all pupils, value social justice, and have a strong awareness of and

positive attitude toward the importance of a multicultural perspective. They also have

moderate confidence in their personal efficacy to teach all students, although with less

certainty than their other beliefs. The exploration of differences in scores between students

grouped by demographic, experiential, and program characteristics revealed some differences

that warrant discussion among program faculty and administrators. In addition, recent

differences in the factor structure of the scale that emerged from PCA highlight the importance

of continuing research on its psychometric properties.


CONTENTS

ABSTRACT

CONTENTS

TABLES

INTRODUCTION

BACKGROUND ON EBMAS

METHOD

Data Collection

Data

Data Analysis

RESULTS

Description of the Participants

Validity and Reliability

Assessment of TEAC Claims

Differences in Scale Scores by Student Demographics and Experience

Perceived Effectiveness of the Steinhardt Teacher Education Program

SUMMARY AND CONCLUSIONS

REFERENCES CITED


TABLES

Table 1. N and percent of students taking EBMAS by academic year

Table 2. Race/ethnicity of total sample and sample with data

Table 3. Number and percent of total sample and sample with data in certification areas

Table 4. Credits completed within degree groups for EBMAS respondents

Table 5. Summary of ANOVA for differences in EBMAS subscale scores among students at different stages of their programs (undergraduates)

Table 6. Summary of t-test for differences in EBMAS subscale scores between new and late-stage students (graduate students)

Table 7. Summary of ANOVA for tests of significance of the main and interaction effects of year and degree on EBMAS subscale scores (late-stage BS and MA students in Classes of 2010 - 2012)

Table 8. Mean EBMAS subscale scores by degree and year compared to the program standard of 4.50 (Classes of 2010 - 2012)

Table 9. Summary of t-tests and ANOVAs for test of significance of differences in EBMAS subscale scores by descriptive characteristics of late-stage undergraduate student teachers (Classes of 2010 - 12)

Table 10. Homogeneous subsets of EBMAS PTE 1 means by certification area for late-stage undergraduates (Classes of 2010 - 12)

Table 11. Homogeneous subsets of EBMAS PTE 2 means by race/ethnicity for late-stage undergraduates (Classes of 2010 - 12)

Table 12. Homogeneous subsets of EBMAS GTE means by race/ethnicity for late-stage undergraduates (Classes of 2010 - 12)

Table 13. Mean EBMAS subscale scores for late-stage undergraduate students with varying types of student teaching experience (Classes of 2010 - 2012)

Table 14. Summary of t-tests and ANOVAs for test of significance of differences in EBMAS subscale scores by descriptive characteristics of late-stage graduate student teachers (Classes of 2010 - 12)

Table 15. Homogeneous subsets of EBMAS GTE means by certification area for late-stage graduate students (Classes of 2010 - 12)

Table 16. Homogeneous subsets of EBMAS MA means by certification area for late-stage graduate students (Classes of 2010 - 12)

Table 17. Homogeneous subsets of EBMAS PTE 1 means by race/ethnicity for late-stage graduate students (Classes of 2010 - 12)

Table 18. Homogeneous subsets of EBMAS GTE means by race/ethnicity for late-stage graduate students (Classes of 2010 - 12)

Table 19. Homogeneous subsets of EBMAS SJ means by race/ethnicity for late-stage graduate students (Classes of 2010 - 12)

Table 20. Mean EBMAS subscale scores for late-stage graduate students by significant descriptive characteristics (Classes of 2010 - 2012)

Table 21. Summary of ANOVA comparing mean scores of late-stage undergraduates to question 17 for the three Classes of 2009-10 - 2011-12

Table 22. Summary of ANOVA comparing mean scores of late-stage graduate students to question 17 for the three Classes of 2009-10 - 2011-12

Table 23. Mean scores of late-stage graduate students to question 17 of EBMAS (Classes of 2009-10 - 2011-12)

Table 24. Mean scores of late-stage graduate students to question 17 of EBMAS (Classes of 2009-10 - 2011-12)


INTRODUCTION

One component of NYU Steinhardt’s comprehensive system for assessing the

development of its teacher education students and the effectiveness of its teacher education

program is the Education Beliefs and Multicultural Attitudes Scale (EBMAS). NYU recognizes

that the qualities of a competent and caring teacher go beyond knowledge of subject matter

and pedagogical skills that can be tested and observed. Also important are beliefs and attitudes

toward teaching, learners and learning, and cultural communities—or what Burant et al. (2007)

refer to as teaching dispositions—as well as beliefs in one’s teaching efficacy. EBMAS was

developed by Steinhardt’s Center for Research on Teaching and Learning to measure these

unobservable dispositions using survey research methodology.

EBMAS data were used in NYU’s TEAC Inquiry Brief (Tobias, Pietanza, & McDonald, 2011)

as evidence supporting the claims that its graduates were competent and caring teachers. This

paper presents updated EBMAS findings from the continued assessment of NYU’s teacher

education students during the 2010 – 11 and 2011 – 12 academic years. In addition to

presenting the overall findings for the assessment, the paper describes the results from CRTL’s

continued research on EBMAS, including analysis of its psychometric properties and

investigation of differences in scores for students disaggregated by program, demographics, and

experience.

Following this introduction, the paper provides background and history on the

development of EBMAS. Next is a section on the survey methods followed by the presentation

of results and a discussion of their implications.

BACKGROUND ON EBMAS

CRTL developed EBMAS in fall 2009 as a measure of teacher candidates’ developing

dispositions toward teaching. EBMAS replaced its precursor, the Educational Beliefs

Questionnaire (EBQ), which was administered to Steinhardt teacher-education students from

2004 - 2008. The initial form of EBMAS, which was administered to NYU teacher education

students in fall 2009 and spring 2010 as part of the TEAC re-accreditation self-inquiry study,

consisted of 39 items. In addition to the EBQ, EBMAS items were drawn from the Teacher

Efficacy Scale (TES) (Gibson and Dembo, 1984) and the Teacher Multicultural Attitude Survey

(TMAS) (Ponterotto et al., 1998). Item selection was based on alignment with the goals of the

NYU program and the clarity of the items. It was hypothesized that the original 39-item scale

would measure four constructs: General Teacher Efficacy (GTE), defined as the overall belief

that teaching can promote the learning of all students regardless of home background or

community; Personal Teacher Efficacy (PTE), which is the teacher’s own belief that he or she

can educate all children regardless of background; Multicultural Attitudes (MA), which is the

teachers’ awareness of, comfort with, and sensitivity to issues of cultural pluralism in the

classroom; and Social Justice (SJ), defined as their belief in the moral and social responsibility


of teachers to educate all children equitably. However, factor analysis of the data from the

administration of the 39-item survey found a slightly different factor structure that

comprised 28 of the items and explained 48% of the variance. While GTE emerged as a

major factor as expected, the results of the factor analysis differed from expectancy with

respect to the other three factors. First, the PTE items split into two factors: one was labeled

PTE 1, and included items that asked the extent to which aspiring teachers felt capable of

dealing with a variety of classroom situations and pupil problems; the second, labeled PTE 2,

included items that asked the extent to which they felt that the successes of their pupils could

be attributed to their teaching. The differences between these two factors are subtle and had

not been observed in previous research on teacher efficacy, which had focused exclusively on

practicing teachers. Second, the items designed to measure MA and SJ separately loaded on a

single factor, which was labeled MA/SJ. Therefore, four subscale scores were computed using

the 28 items aligned with the empirical factors and were used to assess the claims in the TEAC

Inquiry Brief. In addition, EBMAS was reformatted to a 28-item version, which has been used in

all administrations of the survey subsequent to spring 2010.

CRTL continues to do research on EBMAS, which includes ongoing study of its factor

structure, validity, and reliability. The results of some of this research have led to the further

modification of the subscales, which has changed the scoring and reporting of results in this

paper. The new scoring system will also be described in the 2013 Annual Report to TEAC.

METHOD

Data Collection

CRTL attempts to administer EBMAS to all teacher education students, both graduate and

undergraduate, twice, once at the beginning of their first semester at Steinhardt and again near

the end of their last semester. A sample of undergraduates in the dual childhood and early

childhood programs also takes EBMAS at mid-preparation, in the beginning of their junior year.

In actual implementation, undergraduates in the mid-preparation administration have varying

levels of accumulated credits, resulting in their division into Early and Middle preparation

groups for comparative analysis research.

The data included in this paper are from EBMAS administrations for the three academic

years 2009 –10 thru 2011 –12. During this period, EBMAS was administered in two formats:

paper-and-pencil and on-line through Survey Gizmo. Since the audience is captive, the return

rate for the former tends to be much higher; therefore, it was the mode of administration

for all semesters except fall 2011. The paper-and-pencil form was administered by instructors

at sessions of the following classes/events:

• Undergraduates: The entry EBMAS was given by instructors of the New Student

Seminar, which was attended by all teacher education students. The exit EBMAS was

administered at seminars embedded in the final term of student teaching. Students


who took the mid-preparation EBMAS were assessed in seminars at the beginning of the

first semester of student teaching.

• Graduates: Fast Track MA students took the entry EBMAS during the orientation

sessions at the beginning of the first summer session. Fall new enrollees were assessed

at the beginning of Inquiries, the core pedagogical course. The exit EBMAS was given

near the end of the seminar associated with their last student teaching placement.

The on-line form was administered in the same timeframe as the paper-and-pencil form

via email invitations. In order to maximize response rates, email invitations were sent in the

name of faculty, usually program directors, that the students would recognize. Arrangements

for EBMAS administration were collaboratively coordinated with the Director of Clinical

Services, the program directors, and chairs of the arts departments.

Data

The EBMAS items are statements of beliefs that students respond to using a six-point

Likert scale of agreement ranging from (1) Strongly Disagree to (6) Strongly Agree, with the

intermediate categories labeled (2) Moderately Disagree, (3) Slightly Disagree, and so on. Item

statements are counterbalanced, with some stated in the positive form and some in the

negative.

Student workers key-entered completed paper-and-pencil survey forms into SPSS files

within two weeks of each administration. On-line survey data were downloaded into SPSS files

within two weeks of survey closing. The data consist of the individual numerical scores for each

EBMAS item and computed mean scores (scale = 1 – 6) for each of the scales. In computing the

scale scores, responses to reverse-coded items were flipped so that high scores always indicated

positive beliefs and attitudes. Each record also contained demographic data and information on

educational experience and academic programs.
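The scoring step described above, flipping reverse-coded items and then averaging, can be sketched as follows. This is an illustration only: the item identifiers, the item-to-scale assignment, and the handling of unanswered items are hypothetical and are not taken from the EBMAS scoring key.

```python
from statistics import mean

SCALE_MAX = 6  # six-point agreement scale, 1 = Strongly Disagree, 6 = Strongly Agree

def reverse(score):
    """Flip a response so that a high score always indicates a positive belief."""
    return SCALE_MAX + 1 - score

def scale_score(responses, items, reversed_items):
    """Mean (1-6) of the answered items on one scale.

    Skipping unanswered items is an assumption; the report does not say
    how missing responses were treated.
    """
    values = []
    for item in items:
        raw = responses.get(item)
        if raw is None:
            continue
        values.append(reverse(raw) if item in reversed_items else raw)
    return mean(values) if values else None

# Hypothetical respondent and hypothetical four-item scale
responses = {"q1": 5, "q2": 2, "q3": 6, "q4": 1}
example_items = ["q1", "q2", "q3", "q4"]
example_reversed = {"q2", "q4"}  # negatively worded statements

score = scale_score(responses, example_items, example_reversed)
print(score)
```

Here the two negatively worded items become 5 and 6 after flipping, so all four values point in the same direction before they are averaged.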

Data Analysis

First, descriptive statistics were computed on the demographic and experience

characteristics of the participants, to describe the sample. Next, in order to determine whether

the data continued to support the hypothesized factor structure of the scale, the updated full

database was submitted to principal components factor analysis with varimax rotation (PCA).

Empirical deviations in the rotated factor solution were examined in relation to the theoretical

structure underlying EBMAS. Following this examination, the scales were modified accordingly

and checked for internal consistency reliability using Cronbach’s coefficient alphas. Then,

evidence of substantive validity continued to be explored by comparing the subscale scores of

groups of students at the early, middle, and late stages of their teacher education programs.

Next, in order to continue to assess the TEAC claims for the Annual Report to TEAC,

mean subscale scores were computed for program completers in the Classes of 2011 and 2012

and compared to the program standard of mean equal to or greater than 4.50. Means for


question 17, which asks students the extent to which their teacher education program has

given (will give) them the skills to be an effective teacher, were calculated separately, as a

measure of perceived program effectiveness. In order to assess the stability reliability of the

results across time, ANOVAs were applied to the mean subscale scores with year (2009 – 10,

2010 – 11, and 2011 – 12) as the independent variable.

Finally, ANOVAs and t-tests for independent samples were applied to mean subscale

scores of participants grouped by descriptive and experience variables, including certification

area, Fast Track program, gender, ethnicity, international students, prior teaching experience,

student teaching, and experience teaching minorities.
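For readers unfamiliar with the group comparisons named above, the F statistic of a one-way ANOVA can be sketched as below. The analyses in this report were run in SPSS; this block only illustrates the calculation, and the three groups of subscale scores are invented for the example. No p-value is computed, since that requires the F distribution, which is outside the Python standard library.

```python
from statistics import mean

def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a one-way ANOVA."""
    all_values = [x for g in groups for x in g]
    grand_mean = mean(all_values)
    # Variation of group means around the grand mean
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of individual scores around their own group mean
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within

# Hypothetical mean subscale scores for three academic years
year_1 = [4.0, 4.5, 5.0]
year_2 = [4.5, 5.0, 5.5]
year_3 = [5.0, 5.5, 6.0]

f_stat, df_b, df_w = one_way_anova_f([year_1, year_2, year_3])
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

A large F indicates that the group means differ by more than the spread of scores within groups would suggest.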

RESULTS

Description of the Participants

The full dataset had data for a total of 1,684 students, which included all teacher

education students who had taken EBMAS during the fall 2009 thru fall 2012 semesters. This report

focuses on the 1,450 NYU teacher education students who took EBMAS during the three academic

years 2009-10 – 2011-12. As can be seen in Table 1, the plurality (N = 609, 42%) of the total

sample took the survey in 2010 – 11, followed by 2011 - 12 (N = 501, 34.6%). More than half (N

= 787, 54.8%) were graduate students, of whom 257 (32.7%) were in the Fast Track program.

More than four-fifths (82.7%) of the total sample was female and 107 (7.1%) were international

students. Of the 1,381 students who provided usable data on race/ethnicity, nearly three-fifths

(58.1%) identified as White or European American and 23.2% as Asian (see Table 2).

Table 1. N and percent of students taking EBMAS by

academic year

Academic Year   N Students   % Total Sample

2009 - 10              340             23.4

2010 - 11              609             42.0

2011 - 12              501             34.6

Total Sample          1450            100.0


Table 2. Race/ethnicity of total sample and sample with data

Race/Ethnicity        N Students   % of Total   % With Data

Latino                       128          8.8           9.3

African American              55          3.8           4.0

Asian                        320         22.1          23.2

White/Euro-American          802         55.3          58.1

Multi-ethnic                  76          5.2           5.5

Total with data             1381         95.2         100.0

Note: 69 students did not provide usable data on race/ethnicity

Table 3 displays the distribution of respondents by certification area. Nearly one-quarter

(N = 354, 24.6%) were in the Dual Childhood/Childhood Special Education major. Other majors

with large numbers of respondents were English (184), Math (155), Dual Early Childhood/Early

Childhood Special Education (154), and Foreign Language/TESOL/Bilingual Education (112). In a later

part of the results section, scale scores will be disaggregated by certification areas within

degree programs with N’s of at least five.

The survey also asked the participants about the number of credits they had

accumulated in the program and their prior experiences in education. Consistent with CRTL’s

protocol for administering EBMAS, the largest numbers of undergraduate respondents were in

the beginning or later stages of their programs. As can be seen in Table 4, 350 (55.6%) of the

undergraduates had between 0 – 15 credits and a total of 176 (27.9%) had 90 or more; the

latter were grouped together as the Late stage group in the analysis of scale scores by stage of

program, which is presented below. For the graduate students, 429 (55.0%) were in the

beginning of the program with 0 – 15 credits. All of the other graduate respondents were

considered to be in the late stage of their studies.

In response to a question about whether they had prior teaching experience, 1,220

(84.1%) responded yes. When respondents were asked to describe this experience, only 12% of the experiences

could be classified as actual teaching and most of this teaching was in foreign countries.

Thirty-eight percent of those reporting their experiences were tutors, 13% were teacher aides

or assistants, 12% cited student teaching, 11% were counselors in camps or after-school

programs, 10% worked in non-formal education programs, such as parks and zoos, and the rest

worked as interns or substitute teachers. In response to a direct question about whether they

had student taught, 41.6% responded yes. Finally, 43% indicated that their teaching or student

teaching experiences included minority students.


Table 3. Number and percent of total sample and sample with data in certification areas

Certification Area                                N Students   % of Total   % With Data

Childhood Ed                                              26          1.8           1.8

Dual Childhood/Childhood Special Ed                      354         24.4          24.6

Dance Ed                                                  23          1.6           1.6

Ed Theater*                                               84          5.8           5.8

Foreign Language Ed                                       69          4.8           4.8

Social Studies Ed                                         73          5.0           5.1

Science Ed                                                78          5.4           5.4

Music Ed                                                  91          6.3           6.3

Dual Early Childhood/Early Childhood Special Ed          154         10.6          10.7

Foreign Language/TESOL/Bilingual Ed                      112          7.7           7.8

English                                                  184         12.7          12.8

Math                                                     155         10.7          10.8

TESOL/Bilingual Ed                                        27          1.9           1.9

Early Childhood Ed                                         8          0.6           0.6

Total                                                   1438         99.2         100.0

* Includes dual majors with social studies and English.

Note: The above data do not include 11 students who did not report their certification areas and one

who reported it as Special Education.


Table 4. Credits completed within degree groups for EBMAS respondents

Credits Completed   Statistic          Undergraduate   Graduate

0-15                N                            350        429

                    % within Degree            55.6%      55.0%

16-30               N                             23         99

                    % within Degree             3.7%      12.7%

31-45               N                             19        170

                    % within Degree             3.0%      21.8%

46-60               N                             32         76

                    % within Degree             5.1%       9.7%

61-75               N                             13

                    % within Degree             2.1%

76-90               N                             16

                    % within Degree             2.5%

91-105              N                             16

                    % within Degree             2.5%

106-120             N                             56

                    % within Degree             8.9%

120 or more         N                            104

                    % within Degree            16.5%

Total               N                            629        774

                    % within Degree           100.0%     100.0%

Note: 41 students did not respond to this question and six gave out of range values, for a total of 47

missing data and excluded from this table.

Validity and Reliability

Structural Validity and Reliability: In order to re-examine the empirical evidence for the

clustering of items into subscales for the calculation of scores for specific dispositional

constructs, PCA was applied to the full dataset of 1,684 students, which included the fall 2012

administration that was only used in this analysis. The results were similar to those for the PCA

that was run on the 2009 – 10 sample, which had taken the earlier 39-item version, with one

exception. The current PCA yielded five factors, with the MA/SJ subscale items splitting into

two factors; the split subscale was more consistent with the theoretical logic that guided the

original construction of the scale. That is, the items that were originally intended to measure

MA and those intended to measure SJ split with each showing high loadings on one factor and

12

low loadings on the other. The five factors accounted for 49.7% of the item variance, slightly

more than the earlier PCA, and were better aligned with the intended theoretical structure of

the original scale. The coefficient alphas for the five scales were moderate to large, confirming

their consistency reliability, as follows: PTE1, alpha = .754; PTE2, alpha = .740, GTE, alpha =

.649; MA, alpha = .848; SJ, alpha = 666. The evidence suggests that the five-factor structure has

reasonable empirical validity and reliability and better theoretical validity than the four-factor

structure. Therefore, the items will be clustered into five subscales for EBMAS scoring in this

and future analyses.
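For readers who want to see how a coefficient alpha of the kind reported above is obtained, the following is a minimal sketch in Python. It assumes the standard formula, alpha = k/(k-1) x (1 - sum of item variances / variance of summed scores). The function name `cronbach_alpha` and the small response matrix are hypothetical illustrations, not EBMAS data or the software actually used for this report.

```python
from statistics import variance


def cronbach_alpha(responses):
    """Coefficient alpha for a list of respondents, each a list of item scores."""
    k = len(responses[0])  # number of items in the subscale
    if k < 2:
        raise ValueError("alpha needs at least two items")
    # Sample variance of each item across respondents.
    item_vars = [variance([row[i] for row in responses]) for i in range(k)]
    # Sample variance of each respondent's summed subscale score.
    total_var = variance([sum(row) for row in responses])
    return (k / (k - 1)) * (1 - sum(item_vars) / total_var)


# Hypothetical responses (4 respondents x 3 items); NOT EBMAS data.
example = [[5, 4, 5], [3, 3, 4], [6, 5, 6], [2, 3, 2]]
alpha = cronbach_alpha(example)
print(f"alpha = {alpha:.3f}")
```

Alpha rises as the items in a subscale covary more strongly, which is why it is read as evidence of internal consistency rather than of validity.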

Substantive Validity: Stages: The NYU Inquiry Brief for continuing accreditation reported that the EBMAS subscale scores of students in the later stage of their program were statistically significantly higher than those of students in the early stage, which was considered to be evidence for the substantive validity of the scale (Tobias et al., 2011). In order to continue to assess the substantive validity of the five subscales of the new EBMAS scoring system, ANOVAs and t-tests were applied to test for the statistical significance of differences in the mean subscale scores of groups of students that varied in their stage of program completion. The results, which are displayed in Table 5 for undergraduate students and Table 6 for graduate students, mostly support the substantive validity of EBMAS, although with a few exceptions. As can be seen in Table 5, undergraduate students in the later stages of their programs, i.e., groups 3 and 4, had higher mean scores than those in the earlier stages, i.e., groups 1 and 2, for four of the five subscales. Note that due to the length of the undergraduate program and the assessment schedule, which allows for three assessments of some undergraduates, undergraduates are divided into four stage groups for this analysis. The late-stage group scored higher than the new group for four of the subscales and higher than the early-stage group for three; the middle-stage group scored higher than the new group for three subscales and higher than the early-stage group for two. There were no statistically significant differences between the new and early-stage groups on any subscale, and no differences among any of the groups for PTE 2, which assesses the extent to which students believe they are or will be responsible for the academic and behavioral accomplishments of their students.

As can be seen in Table 6, due to the shorter duration of the graduate program, these students were divided into two groups, new and late-stage, for this analysis. The results of t-tests for the significance of differences in mean EBMAS subscale scores between the two groups were equivocal, as they had been in the TEAC Inquiry Brief. Consistent with the TEAC results, the late-stage graduate students had a statistically significantly higher mean score than the new students in PTE 1, which measures their belief that they can or will be able to handle their students’ academic and behavioral problems in the classroom. This finding is theoretically reasonable, since program experiences, especially student teaching, are designed to bolster their teaching skill and confidence. However, as we observed for the undergraduates above and consistent with the results reported in the TEAC Inquiry Brief, there were no statistically significant differences in mean PTE 2 scores. The contradictory findings for PTE 1 and PTE 2 add evidence supporting the fundamental difference between the constructs measured by these two subscales and suggest that the former can be impacted by pre-service program experiences, while the latter may not. Disparate findings were also observed in the results for the MA and SJ subscales for the graduate students. Whereas the mean score for the late-stage group was higher than for the new group on the SJ subscale, the reverse was true for the MA subscale. In this regard, it should be noted that these are tests for independent samples and not repeated measures, and the mean MA subscale score of the new students was already quite high.

Table 5. Summary of ANOVA for differences in EBMAS subscale scores among students at different stages of their programs (Undergraduates)

Subscale  Stage of Program  N    Mean  Std. Deviation  F      Sig    Stages with significant differences
PTE1      New (1)           350  3.53  0.85            68.84  0.000  1 & 2 < 3 & 4
          Early (2)         41   3.74  0.90
          Middle (3)        61   4.40  0.89
          Late (4)          176  4.59  0.80
PTE2      New (1)           350  4.28  0.71            0.75   0.524  None
          Early (2)         42   4.15  0.68
          Middle (3)        61   4.33  0.68
          Late (4)          175  4.23  0.73
GTE       New (1)           350  4.80  0.79            5.42   0.001  1 < 4
          Early (2)         42   4.86  0.95
          Middle (3)        61   5.11  0.92
          Late (4)          176  5.07  0.81
MA        New (1)           350  5.01  0.69            28.35  0.000  1 & 2 < 3 & 4
          Early (2)         42   5.14  0.71
          Middle (3)        61   5.51  0.53
          Late (4)          176  5.52  0.59
SJ        New (1)           350  4.96  0.60            24.34  0.000  1 < 3 & 4; 2 < 4
          Early (2)         42   5.05  0.60
          Middle (3)        61   5.35  0.52
          Late (4)          176  5.37  0.54

Table 6. Summary of t-tests for differences in EBMAS subscale scores between new and late-stage students (graduate students)

Subscale  Stage  N    Mean  Std. Dev.  M diff.  T      Df   Signif.
PTE1      Late   343  4.38  0.75       0.51     8.69   770  0.000
          New    429  3.87  0.82
PTE2      Late   342  4.42  0.68       -0.01    -0.12  767  0.904
          New    427  4.42  0.68
GTE       Late   343  4.84  0.92       -0.08    -1.31  697  0.191
          New    429  4.92  0.83
MA        Late   342  5.36  0.63       -0.12    -2.69  683  0.007
          New    429  5.48  0.55
SJ        Late   342  5.29  0.62       0.09     2.06   769  0.040
          New    429  5.20  0.56

Stability Reliability: In order to assess the stability of EBMAS subscale scores over time, ANOVAs were applied to the differences in the mean subscale scores across the three years of the study (see Table 7). This analysis only included students in the late stage of the program. In addition to testing for the main effects of year, the analyses tested for the main effects of degree program and the interaction effects of year and degree. As can be seen in Table 7, there was no statistically significant interaction effect or main effect for year. There was a statistically significant main effect for degree program for four of the five subscales, but this does not detract from the evidence supporting the stability of the findings over time. Inspection of the total means across the three years in Table 8 reveals that the mean scores of undergraduates were significantly higher than those for graduate students for three of the subscales, PTE 1, GTE, and MA, while the mean for graduate students was significantly higher for PTE 2.

Table 7. Summary of ANOVA for tests of significance of the main and interaction effects of year and degree on EBMAS subscale scores (late-stage BS and MA students in Classes of 2010 - 2012)

          Effects *
Subscale  Year                    Degree                  Year by Degree
          Df       F     Sig      Df       F     Sig      Df       F     Sig
PTE1      2 & 513  2.37  0.094    1 & 513  8.85  0.003    2 & 513  1.55  0.213
PTE2      2 & 511  0.59  0.601    1 & 511  4.23  0.003    2 & 511  0.39  0.674
GTE       2 & 513  1.36  0.258    1 & 513  8.20  0.004    2 & 513  2.27  0.104
MA        2 & 512  1.70  0.183    1 & 512  8.20  0.004    2 & 512  2.38  0.094
SJ        2 & 512  0.48  0.619    1 & 512  2.68  0.102    2 & 512  2.09  0.125

* Effects (F, sig.) in bold font are statistically significant at p < .05

Assessment of TEAC Claims

NYU uses the EBMAS as one of its measures of two of its four TEAC claims (Claim 3, Clinical Competence, and Claim 4, Caring Professional) and one of the three cross-cutting themes (CCT), CCT 2, Multicultural Perspective. The program standard established by the faculty for attainment of the claims is a mean score for late-stage students of at least 4.50 on the subscales aligned with each claim. Table 8 displays the results of the assessment of the claims using EBMAS for the three academic years, 2009-10 through 2011-12, the first of which is a re-analysis of the data that were reported in the TEAC Inquiry Brief for reaccreditation. Consistent with the high stability reliability of this measure reported above, the results show high consistency across the three years. For undergraduates, participants met the program standard all three years in four of the five subscales, PTE 1 (Claim 3), GTE (Claim 4), MA (CCT 2), and SJ (Claim 4). On the other hand, undergraduates fell below the program standard in PTE 2 (Claim 3) by about one-quarter point for all three years. This is further evidence of the fundamental difference between these two types of PTE and suggests that although undergraduates are confident they know how to help their students learn and behave, they are less sure that the successes of their students can be attributed to their teaching. Table 8 shows similar positive results for graduate students on the Claim 4 measures, GTE and SJ, and the CCT 2 measure, MA, but somewhat different outcomes on the Claim 3 measures. Although graduate students fell below the program standard for both PTE 1 and PTE 2 across the three years combined, the shortfall was only about a tenth of a point overall and they did meet the standard in PTE 1 in 2011-12. Moreover, the mean scores for the two subscales have been increasing across the three years. Thus, the overall findings continue to provide evidence supporting the claims.

Table 8. Mean EBMAS subscale scores by degree and year compared to the program standard of 4.50 (Classes of 2010 - 2012)

Subscale /        Year       Undergraduate                       Graduate
Claims **                    N    Mean  Std. Dev.  M - 4.50 *    N    Mean  Std. Dev.  M - 4.50 *
PTE1 (Claim 3)    2009-10    54   4.52  0.84       0.02          109  4.22  0.76       -0.28
                  2010-11    54   4.70  0.66       0.20          114  4.40  0.80       -0.10
                  2011-12    68   4.55  0.88       0.05          120  4.51  0.67       0.01
                  Total      176  4.59  0.80       0.09          343  4.38  0.75       -0.12
PTE2 (Claim 3)    2009-10    53   4.18  0.78       -0.32         109  4.40  0.70       -0.10
                  2010-11    54   4.25  0.76       -0.25         114  4.36  0.70       -0.14
                  2011-12    68   4.24  0.67       -0.26         119  4.49  0.64       -0.01
                  Total      175  4.23  0.73       -0.27         342  4.42  0.68       -0.08
GTE (Claim 4)     2009-10    54   5.25  0.67       0.75          109  4.78  0.93       0.28
                  2010-11    54   4.94  0.90       0.44          114  4.77  0.95       0.27
                  2011-12    68   5.02  0.83       0.52          120  4.96  0.88       0.46
                  Total      176  5.07  0.81       0.57          343  4.84  0.92       0.34
MA (CCT 2)        2009-10    54   5.52  0.51       1.02          108  5.22  0.67       0.72
                  2010-11    54   5.54  0.57       1.04          114  5.35  0.68       0.85
                  2011-12    68   5.50  0.66       1.00          120  5.50  0.49       1.00
                  Total      176  5.52  0.59       1.02          342  5.36  0.63       0.86
SJ (Claim 4)      2009-10    54   5.41  0.49       0.91          109  5.19  0.60       0.69
                  2010-11    54   5.37  0.51       0.87          114  5.28  0.70       0.78
                  2011-12    68   5.34  0.61       0.84          119  5.39  0.52       0.89
                  Total      176  5.37  0.54       0.87          342  5.29  0.62       0.79

* Values in bold font indicate the program standard has been met or exceeded.

** TEAC Claims: Claim 3, Clinical Competence; Claim 4, Caring Professional; CCT 2, Multicultural

Perspective
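The comparison against the program standard in Table 8 is simple arithmetic, and a short sketch may make the "M - 4.50" column easier to audit. The Python below recomputes the gap for the three-year Total rows of Table 8; the dictionary layout and the helper name `gap_to_standard` are illustrative assumptions, not part of the report's actual analysis.

```python
STANDARD = 4.50  # program standard set by the faculty

# Three-year "Total" means for late-stage students, copied from Table 8.
late_stage_totals = {
    ("PTE1", "undergraduate"): 4.59, ("PTE1", "graduate"): 4.38,
    ("PTE2", "undergraduate"): 4.23, ("PTE2", "graduate"): 4.42,
    ("GTE", "undergraduate"): 5.07, ("GTE", "graduate"): 4.84,
    ("MA", "undergraduate"): 5.52, ("MA", "graduate"): 5.36,
    ("SJ", "undergraduate"): 5.37, ("SJ", "graduate"): 5.29,
}


def gap_to_standard(mean, standard=STANDARD):
    """Return the mean minus the standard, rounded to two decimals."""
    return round(mean - standard, 2)


# Map each (subscale, degree) to (gap, whether the standard is met).
results = {
    key: (gap_to_standard(mean), mean >= STANDARD)
    for key, mean in late_stage_totals.items()
}

for (subscale, degree), (gap, met) in results.items():
    status = "met" if met else "not met"
    print(f"{subscale:5s} {degree:13s} gap = {gap:+.2f} ({status})")
```

Run on the Total rows, this flags PTE 2 for undergraduates and both PTE subscales for graduate students as falling short, which matches the narrative above.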

Differences in Scale Scores by Student Demographics and Experience

A series of statistical analyses were performed to determine whether EBMAS scale scores varied for groups of students who differed in key measured program, experiential, and demographic variables. In order to control for stage in the program and degree, the participants were late-stage students and separate analyses were performed for undergraduate and graduate students.

Undergraduates: Table 9 summarizes the statistical analyses performed on the EBMAS subscale scores of groups of undergraduates varying in descriptive characteristics. As the bold font indicates, statistically significant differences in at least one of the five subscales were observed for four of the six measured descriptive characteristics as follows: PTE 1 for certification area; PTE 2 and GTE for race/ethnicity; PTE 1 for student teaching; and four subscales, PTE 1, GTE, MA, and SJ, for experience teaching/student teaching minorities. No significant differences were observed for gender or international versus American students. Further analyses were performed to determine the nature and size of these statistically significant differences.

Table 9. Summary of t-tests and ANOVAs for test of significance of differences in EBMAS subscale scores by descriptive characteristics of late-stage undergraduate student teachers (Classes of 2010 - 12)

                       EBMAS subscales
Descriptor             PTE 1          PTE 2          GTE            MA             SJ
                       t/F    Sig     t/F    Sig     t/F    Sig     t/F    Sig     t/F    Sig
Certification area     5.13   0.000   0.35   0.909   0.33   0.922   1.35   0.236   0.698  0.652
Gender                 -0.61  0.542   -0.24  0.815   -1.57  0.119   0.28   0.777   -0.05  0.757
Race/ethnicity         2.32   0.060   2.71   0.032   4.87   0.001   1.25   0.293   1.02   0.398
International student  -0.60  0.584   -0.06  0.96    -0.89  0.375   -0.81  0.417   -0.87  0.386
Student teaching       2.91   0.004   0.10   0.918   0.69   0.489   0.33   0.740   0.31   0.759
Taught minorities      2.52   0.013   0.03   0.974   2.93   0.011   2.73   0.011   1.97   0.051


First, Scheffe post-hoc comparisons among pairs of PTE 1 means were performed for seven certification areas with a minimum N of 5. Table 10 displays the means in rank order from low to high and in homogeneous subsets; that is, means in the same subset do not differ significantly but means in one subset differ significantly from means not in that same subset. Accordingly, the mean PTE 1 scores for undergraduates in math and music are significantly lower than the mean for dual early childhood/early childhood special education. No other differences between certification areas were statistically significant.

Table 10. Homogeneous subsets of EBMAS PTE 1 means by certification area for late-stage undergraduates (Classes of 2010 - 12)

                                                      Subset for alpha = 0.05
Certification area                               N    1      2
math                                             11   3.85
music                                            17   4.04
science ed                                       5    4.12   4.12
English ed                                       23   4.61   4.61
dual childhood/childhood special ed              50   4.63   4.63
ed theater                                       7    4.69   4.69
dual early childhood/early childhood special ed  54          4.89

Means for groups in homogeneous subsets are displayed.

a. Uses Harmonic Mean Sample Size = 12.183.

b. The group sizes are unequal with a minimum N of 5. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.

c. Means in bold font for certification areas in subset 1 are statistically significantly smaller than the means in bold font for certification areas in subset 2.
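The harmonic mean sample size in footnote a can be checked directly from the group Ns listed in Table 10. The sketch below uses Python's standard-library `statistics.harmonic_mean`; the dictionary name `group_sizes` is only an illustrative container for the Ns in the table.

```python
from statistics import harmonic_mean

# Group sizes (N) for the seven certification areas in Table 10.
group_sizes = {
    "math": 11,
    "music": 17,
    "science ed": 5,
    "English ed": 23,
    "dual childhood/childhood special ed": 50,
    "ed theater": 7,
    "dual early childhood/early childhood special ed": 54,
}

# Harmonic mean = number of groups / sum of reciprocals of the group sizes.
hm = harmonic_mean(list(group_sizes.values()))
print(f"Harmonic mean sample size = {hm:.3f}")
```

Because the harmonic mean is pulled toward the smallest groups (here N = 5 and N = 7), it is far below the arithmetic mean of about 24, which is one reason footnote b cautions that Type I error levels are not guaranteed when group sizes are this unequal.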

Tables 11 and 12 display the results of the respective Scheffe post-hoc comparisons of

PTE 2 and GTE means between racial/ethnic groups. For PTE 2, the mean for African Americans

was significantly lower than the means for Latinos and Whites/European-Americans. For GTE,

the mean for Asian undergraduates was significantly lower than the mean for Latinos.

Table 11. Homogeneous subsets of EBMAS PTE 2 means by race/ethnicity for late-stage undergraduates (Classes of 2010 - 12)

                          Subset for alpha = 0.05
Race/Ethnicity       N    1      2
African American     8    3.84
Asian                30   3.95   3.95
Multi-racial         15   4.25   4.25
Latino               30          4.35
White/Euro-American  82          4.35



b. The group sizes are unequal. The harmonic mean of the group sizes is used. Type I error levels are not guaranteed.


Table 12. Homogeneous subsets of EBMAS GTE means by race/ethnicity for late-stage undergraduates (Classes of 2010 - 12)

                          Subset for alpha = 0.05
Race/Ethnicity       N    1      2
Asian                30   4.63
White/Euro-American  82   5.12   5.12
Multi-racial         15   5.23   5.23
African American     8    5.34   5.34
Latino               30          5.38





Table 13 displays the means for the EBMAS subscales that had statistically significant differences on t-tests for independent samples comparing students who had and had not student taught and those who had and had not taught or student taught substantial numbers of minority students. It should be remembered that most of the “teaching experiences” of the respondents involved assisting teachers, after-school programs, or non-formal schooling (see above). As can be seen in Table 13, the mean PTE 1 score for undergraduates who had student taught was significantly higher than for those who had not. This makes theoretical sense, since the students who had student taught would have had the opportunity to test their personal teaching skill in practice. In addition, students who had experience teaching/student teaching minorities had significantly higher PTE 1, GTE, MA, and SJ scores than those who had not. These findings support the program’s theory and practice of providing student teachers field opportunities in high-minority schools.

Table 13. Mean EBMAS subscale scores for late-stage undergraduate students with varying types of student teaching experience (Classes of 2010 - 2012) *

Experience                                 Subscale  Yes/No  N    Mean  SD
Have you student taught?                   PTE1      Yes     158  4.64  0.80
                                                     No      16   4.04  0.69
Have you taught minority students before?  PTE1      Yes     148  4.63  0.82
                                                     No      23   4.18  0.57
                                           GTE       Yes     148  5.14  0.80
                                                     No      23   4.61  0.80
                                           MA        Yes     148  5.57  0.56
                                                     No      23   5.16  0.69
                                           SJ        Yes     148  5.41  0.54
                                                     No      23   5.17  0.57

* Data are displayed only for descriptors and scales that showed statistically significant mean differences (see Table X).

Graduate Students: Table 14 summarizes the statistical analyses performed on the EBMAS subscale scores of groups of graduate students varying in descriptive characteristics. Statistically significant differences in at least one of the five subscales were observed for all but one (taught minorities) of the seven measured descriptive characteristics as follows: MA for Fast Track; GTE and MA for certification area; PTE 2 and MA for gender; PTE 1, GTE, and SJ for race/ethnicity; GTE and SJ for international students; and PTE 1, GTE, and SJ for student teaching.

First, a t-test for independent samples revealed that the mean MA score of students in the Fast Track program was significantly lower than the mean for those in the regular program, M = 5.31, SD = 0.64, N = 0.64 for the former versus M = 5.45, SD = 0.61, N = 394 for the latter. This was the only significant difference observed in the EBMAS scores of Fast Track students. Next, Tables 15 and 16 display the results of Scheffe post-hoc comparisons of mean GTE and MA scores, respectively, between pairs of certification areas. As can be seen in Table 15, the mean GTE scores of graduate students in the TESOL/bilingual and music education areas were significantly lower than the mean for students in social studies education; and, as indicated in Table 16, the mean MA scores of students in dance education and mathematics education were significantly lower than the means for dual early childhood/early childhood special education, dual childhood/childhood special education, TESOL/bilingual education, and foreign language education.

Table 14. Summary of t-tests and ANOVAs for test of significance of differences in EBMAS subscale scores by descriptive characteristics of late-stage graduate student teachers (Classes of 2010 - 12)

                       EBMAS Subscales
Descriptor             PTE 1          PTE 2          GTE            MA             SJ
                       t/F    Sig     t/F    Sig     t/F    Sig     t/F    Sig     t/F    Sig
Fast Track             -1.86  0.063   1.22   0.222   1.01   0.314   -2.22  0.027   0.51   0.609
Certification area     1.43   0.152   0.73   0.719   2.63   0.002   4.07   0.000   1.64   0.080
Gender                 0.01   0.989   -2.00  0.047   -1.29  0.197   -2.75  0.008   -1.02  0.308
Race/ethnicity         2.97   0.020   0.11   0.979   16.42  0.000   1.91   0.109   3.61   0.007
International student  0.92   0.359   -0.81  0.418   -5.22  0.000   -0.85  0.395   -2.51  0.017
Student teaching       4.18   0.000   -0.73  0.464   3.35   0.001   0.41   0.681   2.06   0.040
Taught minorities      1.57   0.118   -0.72  0.471   1.51   0.132   -0.29  0.774   1.14   0.256


Table 15. Homogeneous subsets of EBMAS GTE means by certification area for late-stage graduate students (Classes of 2010 - 12)

                                                      Subset for alpha = 0.05
Certification area                               N    1      2
TESOL/bilingual ed                               12   4.33
music ed                                         14   4.43
dance ed                                         6    4.46   4.46
foreign language/TESOL/Bilingual ed              51   4.59   4.59
Math ed                                          56   4.64   4.64
science ed                                       24   4.79   4.79
English ed                                       45   4.90   4.90
foreign language ed                              15   5.00   5.00
dual childhood/childhood special ed              62   5.09   5.09
childhood ed                                     11   5.09   5.09
dual early childhood/early childhood special ed  16   5.14   5.14
ed theater                                       14   5.25   5.25
social studies ed                                13          5.42





Table 16. Homogeneous subsets of EBMAS MA means by certification area for late-stage graduate students (Classes of 2010 - 12)

                                                   Subset for alpha = 0.05
Certification area                            N    1      2
dance ed                                      6    4.85
Math ed                                       55   4.96
Music ed                                      14   5.07   5.07
childhood ed                                  11   5.18   5.18
science ed                                    24   5.30   5.30
English ed                                    45   5.39   5.39
ed theater                                    14   5.44   5.44
foreign language/TESOL/Bilingual ed           51   5.46   5.46
social studies ed                             13   5.53   5.53
dual early childhood/early childhood special  16          5.55
dual childhood/childhood special              62          5.56
TESOL/Bilingual Ed                            12          5.56
foreign language ed                           15          5.63

See footnotes in Table 15 above

Next, Scheffe post-hoc comparisons were applied to all pairs of race/ethnic group means for PTE 1 (Table 17), GTE (Table 18), and SJ (Table 19). The results show that the PTE 1 mean for the multi-racial group was significantly lower than that for Latinos, the GTE mean for Asian students was significantly lower than the means for all other groups except multi-racial, and the SJ mean for Asians was significantly lower than that for Latinos. The consistently lower EBMAS scores of Asian graduate students warrant discussion and further exploration.

Table 17. Homogeneous subsets of EBMAS PTE 1 means by race/ethnicity for late-stage graduate students (Classes of 2010 - 12)

                           Subset for alpha = 0.05
Race/Ethnicity       N     1      2
Multi                16    3.83
Asian                73    4.32   4.32
White/Euro-American  204   4.42   4.42
African American     14    4.46   4.46
Latino               16           4.63




c. Means in bold font for certification areas in subset 1 are statistically significantly

smaller than the means in bold font for certification areas in subset 2.

Table 18. Homogeneous subsets of EBMAS GTE means by race/ethnicity for late-stage graduate students (Classes of 2010 - 12)

                           Subset for alpha = 0.05
Ethnicity            N     1      2
Asian                73    4.17
Multi                16    4.80   4.80
White/Euro-American  204          5.03
Latino               16           5.26
African American     14           5.30




c. Means in bold font for certification areas in subset 1 are statistically significantly

smaller than the means in bold font for certification areas in subset 2.

Table 19. Homogeneous subsets of EBMAS SJ means by race/ethnicity for late-stage graduate students (Classes of 2010 - 12)

                           Subset for alpha = 0.05
Race/Ethnicity       N     1      2
Asian                73    5.12
Multi-racial         16    5.19   5.19
African American     14    5.23   5.23
White/Euro-American  203   5.37   5.37
Latino               16           5.58





Last, Table 20 summarizes the mean scores for subscales that showed statistically significant t-tests for independent samples on three dichotomous descriptors. As can be seen in the table, the mean PTE 2 score of females was significantly higher than that of males; graduate students who had student taught had significantly higher mean PTE 1, GTE, and SJ scores than those who had not; and the mean GTE and SJ scores of international students were significantly lower than those of their American counterparts. The last finding may be related to the consistently lower scores found for Asian students, as described above.

Table 20. Mean EBMAS subscale scores for late-stage graduate students by significant descriptive characteristics (Classes of 2010 - 2012) *

Descriptor                         Subscale  Values  N    Mean  SD
Gender                             PTE2      male    60   4.26  0.74
                                             female  282  4.45  0.66
Have you student taught?           PTE1      Yes     254  4.47  0.71
                                             No      81   4.09  0.77
                                   GTE       Yes     254  4.93  0.90
                                             No      81   4.55  0.92
                                   SJ        Yes     253  5.33  0.61
                                             No      81   5.18  0.54
Are you an international student?  GTE       yes     31   4.05  0.81
                                             no      312  4.92  0.89
                                   SJ        yes     31   4.93  0.85
                                             no      311  5.33  0.58

* Data are displayed only for descriptors and scales that showed statistically significant mean differences (see Table X).

Perceived Effectiveness of the Steinhardt Teacher Education Program

One of the EBMAS questions (Question 17) directly asks students about the effectiveness

of their teacher education program. The question is posed as a statement, “My teacher training

program and/or experience has given me the necessary skills to be an effective teacher.” As a

direct measure of the perceived quality of the program, it warrants separate analysis.

First, Tables 21 and 22 show summaries of ANOVAs comparing the mean program ratings of late-stage undergraduate and graduate students, respectively, across the three years. There were no statistically significant differences between years for the undergraduates, and the ratings were consistently high, approximately five or above each year. On the other hand, there were statistically significant differences between years for the graduate students, with the mean for 2009-10 significantly lower than that for 2011-12. Moreover, a t-test for independent samples found that the overall mean for undergraduates was significantly higher than the graduates’ mean, M = 5.17 (SD = .96) for the former versus M = 4.83 (SD = 1.07) for the latter, t = 3.54, df = 514, p < .001. On the positive side, the mean for graduate students has been increasing over the three years.

Table 21. Summary of ANOVA comparing mean scores of late-stage undergraduates to question 17 for the three Classes of 2009-10 to 2011-12 *

Year     N    Mean  SD    F     Sig.
2009-10  53   5.34  0.76  2.26  0.108
2010-11  54   5.24  0.85
2011-12  68   4.99  1.15
Total    175  5.17  0.96

* Question 17: My teacher training program and/or experience has given me the necessary skills to be an effective teacher.
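As a rough plausibility check on Table 21, a one-way ANOVA F statistic can be approximated from the published group sizes, means, and standard deviations alone. The sketch below is an illustration under that assumption; the function name `anova_from_summary` is hypothetical, and because the inputs are rounded to two decimals the result will not reproduce the reported F exactly.

```python
def anova_from_summary(groups):
    """One-way ANOVA from summary statistics.

    groups: list of (n, mean, sd) tuples, one per group.
    Returns (F, df_between, df_within).
    """
    n_total = sum(n for n, _, _ in groups)
    grand_mean = sum(n * m for n, m, _ in groups) / n_total
    # Between-groups sum of squares: weighted squared deviations of group means.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m, _ in groups)
    # Within-groups sum of squares recovered from each group's sample SD.
    ss_within = sum((n - 1) * sd ** 2 for n, _, sd in groups)
    df_between = len(groups) - 1
    df_within = n_total - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# (N, Mean, SD) for 2009-10, 2010-11, and 2011-12, copied from Table 21.
table_21 = [(53, 5.34, 0.76), (54, 5.24, 0.85), (68, 4.99, 1.15)]
f_stat, df_b, df_w = anova_from_summary(table_21)
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

With the rounded values in Table 21 this gives an F of roughly 2.2 on 2 and 172 degrees of freedom, close to the reported 2.26; the small gap is plausibly due to rounding of the published means and standard deviations.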

Table 22. Summary of ANOVA comparing mean scores of late-stage graduate students to question 17 for the three Classes of 2009-10 to 2011-12 *

Year     N    Mean  SD    F     Sig.   Sig. Differences
2009-10  109  4.60  1.25  4.15  0.017  2009-10 < 2011-12
2010-11  113  4.88  1.08
2011-12  119  4.99  0.84
Total    341  4.83  1.07

* See foot note for Table 21 above
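The comparison of overall undergraduate and graduate ratings can likewise be checked from the Total rows of Tables 21 and 22. The sketch below assumes a pooled-variance (equal variances) independent-samples t-test, which is consistent with the reported df of 514; the function name `pooled_t` is hypothetical.

```python
from math import sqrt


def pooled_t(n1, m1, sd1, n2, m2, sd2):
    """Independent-samples t-test (pooled variance) from summary statistics."""
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    se = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Total rows: undergraduates (Table 21) versus graduate students (Table 22).
t_stat, df = pooled_t(175, 5.17, 0.96, 341, 4.83, 1.07)
print(f"t = {t_stat:.2f}, df = {df}")
```

Using the graduate Total mean of 4.83 from Table 22, this sketch yields t of about 3.54 with df = 514, in line with the values reported in the text.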

Although there were no statistically significant differences in mean ratings between the students grouped by certification areas, these data are displayed for information and discussion in Table 23 for undergraduates and Table 24 for graduate students.

Table 23. Mean scores of late-stage undergraduates to question 17 of EBMAS (Classes of 2009-10 to 2011-12) *

Certification area                               N    Means **
math                                             11   4.45
music                                            16   5.13
dual childhood/childhood special ed              50   5.14
ed theater                                       7    5.14
science ed                                       5    5.20
English                                          23   5.26
dual early childhood/early childhood special ed  54   5.43


** No statistically significant differences between means at p< .05

Table 24. Mean scores of late-stage graduate students to question 17 of EBMAS (Classes of 2009-10 to 2011-12) *

Certification area                               N    Mean **
math                                             55   4.55
foreign language/TESOL/Bilingual ed              50   4.58
dual childhood/childhood special ed              62   4.76
English                                          45   4.78
childhood ed                                     11   4.82
dance ed                                         6    4.83
TESOL/Bilingual Ed                               12   4.92
foreign language ed                              15   4.93
dual early childhood/early childhood special ed  16   4.94
ed theater                                       14   5.14
social studies ed                                13   5.15
music                                            14   5.29
science ed                                       24   5.33


** No statistically significant differences between means at p< .05

Finally, there were no statistically significant differences in mean ratings of program

effectiveness for the descriptors Fast Track, gender, student teaching, previous teaching, and

taught minorities. However among graduate students, international students had a more

positive perception of the program’s effectiveness than American students. The mean for

international students was 5.17 (SD = 0.75) and the mean for American students was 4.78 (SD =

1.10), with the difference statistically significant (T = 2.47, df = 42.1, p = .018). It is interesting

that although the graduate international students had lower EBMAS subscale scores than the

American students, they had a more positive perception of the effectiveness of the program.

SUMMARY AND CONCLUSIONS

This report presented updated findings from the analysis of EBMAS, a component of NYU Steinhardt’s student and program assessment system, for the three academic years 2009-10 through 2011-12. During that time, EBMAS was administered to 1,450 undergraduate and graduate students who were at the beginning, middle, or end of their pre-service teacher-education programs. The report presented findings from continued research on EBMAS, results from the use of the scale to assess TEAC program claims, and analyses of the differences in scores for students grouped by demographic, experience, and program characteristics. These findings update the results reported for a smaller dataset in NYU’s 2011 TEAC Inquiry Brief for re-accreditation. The key findings are as follows:

1. A new PCA of the updated dataset largely replicated the factor structure that

emerged from the PCA of the TEAC dataset, but with one important difference. The

subscale MA/SJ based on the earlier PCA split into two factors, which led to a new

scoring system using five subscales: PTE 1, PTE 2, GTE, MA, and SJ. The new factor

and subscale structure is more consistent with the theory underlying the

development of EBMAS than the previous four subscale structure.

2. The substantive validity of EBMAS was strengthened by new evidence that late-

stage students had higher scores than new students and additional evidence was

found supporting the scale’s internal consistency reliability and stability. Therefore,

inferences about student dispositional development and program effectiveness

based on EBMAS can be made with confidence.

3. Mean subscale scores for late-stage undergraduate and graduate students continued

to meet and exceed the TEAC program standards for Claim 4, Caring Professionals,

and the Cross-Cutting Theme for Multi-cultural perspective; however the standards

for Claim 3, Clinical Competence, were only partially met for undergraduates and

weakly supported for graduate students. The mean scores of undergraduates were

significantly higher than those for graduate students for three of the subscales, PTE

1, GTE, and MA, while the mean for graduate students was significantly higher for

PTE 2. These findings are largely consistent with the 2011

TEAC Inquiry Brief, although the scores of graduate students appear to be increasing

over the three years.

4. There were several noteworthy differences in the scores of students grouped by demographic, experience, and program variables. For undergraduates, students in the Dual Early Childhood/Early Childhood Special Education certification area had higher mean PTE 1 scores than those in Math Education and Music Education; Latino and White/European-American students had higher PTE 2 scores than African American students, and Latino students had higher GTE scores than Asian students; students who had student taught had higher PTE 1 scores than those who had not; and those who had taught or student taught minority pupils had higher PTE 1, GTE, MA, and SJ scores than those who had not. Among graduate students, those in the Fast Track program had lower mean MA scores than those in the regular program; students in Social Studies Education had a higher mean GTE score than those in TESOL/Bilingual Education and Music Education, while those in Foreign Language Education, TESOL/Bilingual Education, Dual Childhood/Childhood Special, and Dual Early Childhood/Early Childhood Special had higher mean MA scores than those in Dance Education and Math Education; Latino students had higher PTE 1 scores than Multi-racial students and higher GTE and SJ scores than Asian students, while White/European-American and African-American students also had higher GTE scores than Asian students; female students had higher PTE 2 scores than males; those who had student taught had higher PTE 1, GTE, and SJ scores than those who had not; and international students had lower GTE and SJ scores than American students.

5. Overall, students gave very high ratings to their teacher education program in terms

of giving them the necessary skills to be an effective teacher. Undergraduates gave

significantly higher ratings to their program than graduate students, although the

mean rating for graduate students in the most recent year, 2011-12, was significantly

higher than in 2009-10. There were no statistically significant differences in these

ratings for descriptive variables, with the exception of a significantly higher mean

rating for international graduate students than American students, despite the

former’s generally lower EBMAS scale scores.

These findings lead to the overall conclusion that EBMAS has been a valid and reliable

tool for assessing the developing teaching dispositions of NYU Steinhardt teacher education

students and, consequently, the data have important implications for readiness to teach of the

graduates and the effectiveness of the program in preparing competent and caring educators.

NYU graduates generally have strong beliefs in the general efficacy of teaching to promote the

learning and positive development of all pupils, value social justice, and have a strong awareness of and

positive attitude toward the importance of a multicultural perspective. They also have moderate

confidence in their personal efficacy to teach all students, although with less certainty than in

their other beliefs. The exploration of differences in scores between students grouped by

demographic, experience, and program characteristics revealed some differences that warrant

discussion among program faculty and administrators. Finally, recent differences in the factor

structure of the scale that emerged from PCA highlight the importance of continuing research

on its psychometric properties.

REFERENCES CITED

Burant, T. J., Chubbuck, S. M., & Whipp, J. L. (2007). Reclaiming the moral in the dispositions

debate. Journal of Teacher Education, 58(5), 397-411.

Gibson, S., & Dembo, M. H. (1984). Teacher efficacy: A construct validation. Journal of

Educational Psychology, 76(4), 569-582.

Ponterotto, J. G., Baluch, S., Greig, T., & Rivera, L. (1998). Development and initial score

validation of the teacher multicultural attitude survey. Educational and Psychological

Measurement, 58(6), 1002-1016.

Tobias, R., Pietanza, R., & McDonald, J. (2011). TEAC Inquiry Brief. Submission to the Teacher

Education Accreditation Council, September 2011.

Appendix E: Inventory: Status of evidence from measures and indicators for TEAC Quality Principle I

Type of Evidence (Note: items under each category are examples. Program may have more or different evidence.)

Available and in the Brief

Relied on: Reasons for including the results in the Brief (Location in Annual Report)

Not relied on: Reasons for not relying on this evidence

Not Available and Not in the Brief

For future use: Reasons for including in future Briefs

Not for future use: Reasons for not including in future Briefs

Grades

1. Student grades and grade

point averages

Content Knowledge GPA,

Pedagogical Knowledge GPA,

Clinical Skills GPA, and

Cross-Cutting Themes GPA are valid

and reliable measures of student

mastery of the skills and

knowledge that are associated

with the claims. (pp. 14-15)

Scores on standardized tests

2. Student scores on

standardized license or board

examinations

Scaled scores on the NYSTCE

Content Specialty Tests and

Assessment of Teaching Skills-

Written exams are valid, reliable,

and sensitive measures of

Content Knowledge and

Pedagogical Knowledge, while

scaled scores on the Liberal Arts

and Sciences Test are valid

measures of the cross-cutting

theme of Learning-to-Learn,

which requires a broad and deep

understanding of the tools and

concepts of the liberal arts and

sciences. (pp. 9-12)

3. Student scores on

undergraduate and/or graduate

admission tests of subject

matter knowledge and aptitude

NYU’s claim of Content Knowledge

pertains to the knowledge of program

completers. Faculty believes that

admissions tests for undergraduates taken

four or more years prior to graduation are

not valid measures of the claim because

they are distal in time and not well aligned

with the constructs in content. Admissions

tests are optional for graduate admissions

and few students submit them.

4. Standardized scores and gains

of the program graduates’ own

students

In its Brief, NYU used the VAM

test score gains of the pupils of

graduates teaching in grades 4-8 in

the NYC public schools to measure

Clinical Competence. Recently, the

NYC Department of Education

(NYCDOE) discontinued the

calculation of VAM measures and

transitioned to the use of Growth

Percentile Measures (GPM), which

are used by the NYS Education

Department as part of its new teacher

evaluation system. This system has

been the focus of political debate and

collective bargaining, and we are in

negotiations with NYCDOE to

obtain release of the data for our

graduates. We expect a successful

conclusion to these negotiations and

anticipate receiving these data in

time for the next TEAC Annual

Report in 2014.

Ratings

5. Ratings of portfolios of

academic and clinical

accomplishment

Portfolio data were not included in

the original Brief because of

concerns about logistics, cost, and

low reliability of the measures.

Recently, there have been advances

in portfolio technology and increased

interest as part of the institution of a

new evaluation system for

prospective teachers by NYS. We

are conducting due diligence of the

new systems and plan on piloting

some for possible adoption. We

anticipate that these data will be

available for the next Brief.

6. Third-party rating of

program's students

NYU considered using third-party

ratings of program students but

determined the procedures to be not

feasible logistically. However, the

faculty considers this to be valuable

additional evidence and will attempt

to design feasible methods in the

future.

7. Ratings of in-service,

clinical, and PDS teaching

An important measure used to

assess all four claims and the

cross-cutting theme of Learning-

to-Learn is the DRSTOS-R.

This observation protocol is used

by field supervisors to assess the

developing pedagogical

proficiency of student teachers in

clinical practice. Evidence of

empirical validity and reliability

is presented in the Brief. (pp. 8-9)

NYU believes that in-service

ratings of the teaching of its

graduates can provide useful data for

reflecting back upon the quality of

graduates’ program preparation. As

part of the institution of a new

teacher evaluation system in NYS,

all teachers will receive effectiveness

ratings. NYU plans on obtaining

these ratings for its graduates when

the new system takes effect in 2014.

The new state evaluation system will

also rate pre-service teachers using

the edTPA. NYU plans on using

these ratings to supplement or

replace the DRSTOS-R data.

8. Ratings by cooperating

teacher and college/

university supervisors, of

practice teachers' work samples

Student teachers’ work samples

are used as an important source

of evidence for DRSTOS-R

assessments. The work samples

include journals, lesson plans,

written reflections on practice,

and pupil work. Field

supervisors review the work

samples and then use them

holistically to arrive at the

ratings of related DRSTOS-R

items. This evidence is cited in

the protocols completed by the

field supervisors. (pp. 8-9)

Rates

9. Rates of completion of

courses and program

The faculty believes these data are not valid

measures of the claims and, therefore, they

are not included in the Brief.

10. Graduates' career retention

rates

NYU continues to obtain data

from its Graduate Tracking

Study to compute three-year

retention rates for graduates

teaching in the NYC public

schools. These data are reliable

and valid for assessing the claim

that graduates are Caring

Professionals who have the

commitment and skill to sustain

their careers in inner-city

schools. (pp. 18-20)

11. Graduates' job placement

rates

Job placement rates will not be used in

future Briefs to support the claims, since

they are subject to the vicissitudes of the job

market. Accordingly, they are used by

faculty for information purposes, but not

tested against any program standard.

12. Rates of graduates'

professional advanced study

NYU has been collecting these data

in its Program Exit Surveys since

2009. Faculty believes additional

data from future surveys will be

needed in order to generate reliable

estimates of rates of professional

advanced study.


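13. Rates of graduates'

leadership roles

NYU will be collecting these data in

a planned Five-Year Follow-Up

Survey and they will appear in future

reports.

14. Rates of graduates'

professional service activities

NYU will be collecting these data in

a planned Five-Year Follow-Up

Survey and they will appear in future

reports.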

Case studies and alumni competence

15. Evaluations of graduates by

their own pupils

NYU believes that the questionable

reliability and validity of these data render

the high resource expenditures required to

collect them unwarranted.

16. Alumni self-assessment of

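their accomplishments

NYU will be collecting these data in

a planned Five-Year Follow-Up

Survey and they will appear in future

reports.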

17. Third-party professional

recognition of graduates (e.g.

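NPTS)

NYU will be collecting these data in

a planned Five-Year Follow-Up

Survey and they will appear in future

reports.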

18. Employers' evaluations of

the program's graduates

Principals’ ratings of all teachers

will be part of the new NYS teacher

evaluation system. NYU plans to

obtain these data for its graduates

and use them in future studies.

19. Graduates' authoring of

textbooks, curriculum

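materials, etc.

NYU will be collecting these data in

a planned Five-Year Follow-Up

Survey and they will appear in future

reports.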

20. Case studies of graduates’

own pupils’ learning and

accomplishment

NYU believes the cost of collecting these

data would be excessive and the inferences

that might be drawn from them concerning

graduates’ effectiveness would have weak

validity.

Other Data

21. Students’ self-ratings of

growth during student teaching.

NYU uses the ETFQ to assess

student teachers’ perceptions of

growth in Content Knowledge,

Pedagogical Knowledge, and

Clinical Skills. The results of

this assessment have theoretical

validity and have been consistent

across many cohorts. (pp. 12-13)

22. Students’ dispositions to

teaching.

NYU has developed EBMAS, a

survey that assesses students’

self-perceptions of general

teaching efficacy, personal

teaching efficacy, social justice,

and multicultural attitudes.

EBMAS has demonstrated

empirical validity and internal

consistency reliability for

measuring these dispositions

which research has linked to

teacher quality. (pp. 13-14)

23. Graduates' ratings of

their preparation for teaching

NYU conducts two surveys of

teacher-education program

graduates: the Program Exit

Survey and the One-Year

Follow-Up Survey. These

surveys assess the extent to

which graduates feel that the

program has prepared them to be

successful teachers. The surveys

show consistency of results for

successive administrations,

convergence of findings between

the two surveys, and consistency

with the results from a source

survey developed by Arthur

Levine. In addition, the items

are well aligned with NYU’s

claims. (pp. 15-18)

24. Demographics of

graduates’ schools of

employment

Through its electronic graduate

tracking study, NYU assesses the

demographic characteristics of

the NYC public schools in which

graduates are employed. These

data are used to assess the

graduates’ commitment to

working in inner-city schools,

which is aligned with the claim

of Caring Professionals. (pp. 18-19)
