Why it's time we stopped managing schools like baseball teams …for the most part
John Cronin, Ph.D., Senior Director, the Kingsbury Center at Northwest Evaluation Association
You can view this presentation at SlideShare: http://www.slideshare.net/NWEA/schools-can


1. Why it's time we stopped managing schools like baseball teams …for the most part. John Cronin, Ph.D., Senior Director, the Kingsbury Center at Northwest Evaluation Association. You can view this presentation at SlideShare: http://www.slideshare.net/NWEA/schools-cant

2. How does it work in baseball? In baseball, the contribution of players to the success of the team can be measured (value-added). In baseball, general managers have complete control over the acquisition and deployment of players.
3. How does it work in baseball? Sabermetricians estimate the number of wins a player contributes to his team. It's calculated by estimating the number of runs contributed by a player and adding the number of runs denied by that player's defensive contributions.
4. So what are the issues? We've confused players with managers. The metrics are problematic. We've chosen the wrong focus for policy.
5. Baseball hasn't found a methodology to effectively apply sabermetrics to managers. We assume the statistics applied to players (teachers) can be applied to their managers (principals).
6. How does it work in classrooms? Brian's students took the state exam last spring. A gain projection is estimated for his students; this projection may take into account his students' past performance, their poverty rate, and a variety of other factors. Brian's students' gains on this spring's tests are compared to this projection. If the gains exceed the projection, we say Brian produced value-added. Value-added methodologies attempt to isolate a teacher's contribution to learning by measuring student growth while controlling for or eliminating factors that influence growth but are outside the teacher's control, such as student poverty.
7. How does it work in classrooms? Brian's gain (+.25) is compared to that of other teachers, and he is typically assigned a z-score, a metric that shows where he stands relative to other teachers in the state.
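The projection-and-comparison logic described on slides 6 and 7 can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the students' gains and projections, and the peer scores are all hypothetical, not NWEA's or any state's actual model.

```python
from statistics import mean, stdev

def value_added(actual_gains, projected_gains):
    """Average amount by which students' actual gains beat their projections."""
    residuals = [a - p for a, p in zip(actual_gains, projected_gains)]
    return mean(residuals)

def z_score(teacher_va, all_teacher_vas):
    """Where a teacher stands relative to peers, in standard-deviation units."""
    return (teacher_va - mean(all_teacher_vas)) / stdev(all_teacher_vas)

# Hypothetical data: three of Brian's students slightly beat their projections.
brian_va = value_added([5.1, 6.0, 4.8], [4.9, 5.6, 4.4])
peers = [-0.4, -0.1, 0.0, 0.2, 0.5]   # hypothetical value-added of other teachers
print(round(brian_va, 2), round(z_score(brian_va, peers), 2))  # prints: 0.33 0.87
```

The z-score step is what lets a state compare teachers on a common scale; everything controversial happens earlier, in how the projection is built.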
8. How are principals different? They don't directly deliver instruction to students. Their impact cannot easily be measured within a school year. Source: Lipscomb, S., Teh, B., Gill, B., Chiang, H., & Owens, A. (2010, Sept.). Teacher and Principal Value-Added: Research Findings and Implementation Practices. Cambridge, MA: Mathematica Policy Research.
9. Three schools' value-added math and reading results: who is the better principal? [Chart: one year of math and reading value-added for Langston Hughes Elem, Scott Joplin Elem, and Lewis Latimer Elem.] Many state assessment systems use a single year of data for principal evaluation.
10. Langston Hughes Elementary. [Chart: math and reading value-added, 2009-10 through 2012-13.] High growth, but not improving.
11. Scott Joplin Elementary. [Chart: math and reading value-added, 2009-10 through 2012-13.] Below-average growth; improving, but decelerating.
12. Lewis Latimer Elementary. [Chart: math and reading value-added, 2009-10 through 2012-13.] Below-average growth; improving and accelerating.
13. So what are the issues? We've confused players with managers. The metrics are problematic. We've chosen the wrong focus for policy.
14. How does it work in baseball? In baseball, each player creates his own metrics by getting on base, stealing bases, or making catches. The metrics directly reflect their performance.
15. Issues in the use of growth and value-added measures: differences among value-added models. Los Angeles Times study. Los Angeles Times study #2.
16. Issues in the use of value-added measures: control for statistical error. All models attempt to address this issue. Nevertheless, many teachers' value-added scores will fall within the range of statistical error.
17. Issues in the use of growth and value-added measures: control for statistical error. New York City. New York City #2.
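Slide 16's point about statistical error can be made concrete: a value-added estimate comes with a standard error, and a score whose confidence interval straddles zero cannot be distinguished from an average teacher. A minimal sketch, with hypothetical scores and standard errors:

```python
def distinguishable_from_average(va_score, std_error, z_crit=1.96):
    """True only if the 95% confidence interval around the score excludes zero."""
    lower = va_score - z_crit * std_error
    upper = va_score + z_crit * std_error
    return lower > 0 or upper < 0

# Hypothetical (value-added score, standard error) pairs for four teachers.
teachers = [(0.25, 0.20), (0.45, 0.20), (-0.50, 0.20), (0.10, 0.30)]
flags = [distinguishable_from_average(va, se) for va, se in teachers]
print(flags)  # prints: [False, True, True, False]
```

Note that the first teacher's score (+0.25) looks meaningful but is statistically indistinguishable from zero, which is exactly the situation many teachers land in.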
18. What Makes Schools Work Study, Mathematics. [Scatter plot: teachers' Year 1 vs. Year 2 value-added index within group.] Data used represent a portion of the teachers who participated in Vanderbilt University's What Makes Schools Work project, funded by the federal Institute of Education Sciences.
19. Metrics matter. NCLB metrics influenced educator behavior for a decade.
20. Metrics drive behavior. The term "bubble kid" had a different meaning prior to 2000.
21. One district's change in 5th grade math performance relative to Kentucky cut scores. [Chart: number of students moving up, down, or showing no change, by fall RIT score.]
22. Metrics drive behavior. Race to the Top changes the focus.
23. Number of students who achieved the normal mathematics growth in that district. [Chart: number of students meeting or failing the growth target, by fall score.]
24. Gaming distorts results. Testing conditions may be gamed to inflate results.
25. Test duration and math growth between two terms in one school's fifth grade. [Chart: each student's first- and second-test duration in minutes and scale score gain. The white line represents the average duration of the second test; the yellow line represents the average growth for fifth graders in this district.]
26. Test duration and math growth between two terms in all fifth grades in a district. [Chart: first- and second-test durations and scale score growth.]
27. Test duration and math growth between two terms in all fifth grades in a district. [Chart: first- and second-test durations and scale score growth.]
28. The problem with spring-to-spring testing. [Timeline: a student's spring-to-spring growth trajectory runs from Teacher 1 (March through June 2012) through the summer to Teacher 2 (September 2012 through March 2013).]
29. Metrics do not provide a complete picture of the classroom. They don't capture important non-cognitive factors that impact learning.
30. The intangibles. In baseball, the employment of sabermetrics has reduced the impact that a player's intangibles have on personnel decisions. These intangibles may include leadership qualities, locker room presence, and other personality traits that may contribute to team success.
31. Non-cognitive factors. In education, value-added measurement has focused policy-makers on the teacher's contribution to academic success, as reflected in test scores. Jackson (2012) argues that teachers may have more impact on non-cognitive factors that are essential to student success, like attendance, grades, and suspensions. In baseball, the employment of sabermetrics has focused general managers on the players' contribution to the measures that ultimately matter in the sport: runs and wins. These are not the only measures that matter, however.
32. Non-cognitive factors. Employing value-added methodologies, Jackson found that teachers had a substantive effect on non-cognitive outcomes that was independent of their effect on test scores: they lowered average student absenteeism by 7.4 days, improved the probability that students would enroll in the next grade by 5 percentage points, reduced the likelihood of suspension by 2.8%, and improved the average GPA by .09 (Algebra) or .05 (English). Source: Jackson, K. (2013). Non-Cognitive Ability, Test Scores and Teacher Quality: Evidence from 9th Grade Teachers in North Carolina. Northwestern University and NBER.
33. So what are the issues? We've confused players with managers. The metrics are problematic. We've chosen the wrong focus for policy.
34. Policy has focused on dismissal rather than retention. In baseball, exceptional players are much rarer than average ones. Thus it is vital for a team to keep its best players.
35. Employment of elementary teachers, 2007-2012. The elementary school teacher workforce shrunk by 178,000 teachers (11%) between May 2007 and May 2012. [Chart: number of teachers by year: 2007, 1,538,000; 2008, 1,544,300; 2009, 1,544,270; 2010, 1,485,600; 2011, 1,415,000; 2012, 1,360,380.] Source: (2012, May) Bureau of Labor Statistics, Occupational Employment Statistics. Numbers exclude special education and kindergarten teachers.
36. The impact of seniority-based layoffs on school quality. In a simulation of a layoff of 5% of teachers using New York City data, reliance on seniority-based layoffs would: result in 25% more teachers laid off; lay off teachers who were .31 standard deviations more effective (using a value-added criterion) than those lost using an effectiveness criterion; and retain 84% of teachers with unsatisfactory ratings. Source: Boyd, L., Lankford, H., Loeb, S., & Wycoff, J. (2011). Center for Education Policy, Stanford University.
37. If evaluators do not differentiate their ratings, then all differentiation comes from the test. We must identify the least effective teachers to gain credibility with the public. We must also identify and protect the most effective teachers to improve the profession.
38. Results of Tennessee teacher evaluation pilot. [Chart: percentage of teachers at each rating level 1-5 under the value-added result and the observation result; reported percentages include 53%, 40%, 24%, 23%, 22%, 16%, 12%, 8%, 2%, and 0%.]
39. Results of Georgia teacher evaluation pilot. [Pie chart: evaluator ratings across Ineffective, Minimally Effective, Effective, and Highly Effective; reported percentages are 1%, 2%, 23%, and 75%.]
40. Ratings under new Florida teacher evaluation regulations. [Chart: percentage of teachers in each rating category (Highly Effective, Effective, Developing/3-Year, Needs Improvement, Ineffective) for 2011-12 and 2012-13; reported percentages include 74.6, 61.9, 36.9, 22.6, 2.1, 0.9, 0.5, 0.2, and 0.1, with nearly all teachers rated Effective or Highly Effective in both years.]
41. It's good to learn from past failures.
42. What's the analogy to schools? Policy makers believe value-added metrics provide a statistical means to measure the effectiveness of teachers and principals.
43. What's the assumed parallel to schools? Policy-makers assume that reading and mathematics constitute adequate measures of effectiveness. Policy-makers assume that the principal controls the acquisition and deployment of talent.
44. The Cincinnati approach: method. Evaluators were trained and calibrated to the Danielson model. Both peer and administrator evaluators were used. Each teacher was observed three times by a peer and once by an administrator. Stakes were higher for beginning teachers than veterans. Source: Taylor, E., & Tyler, J. (2012, Fall). Can Teacher Evaluation Improve Teaching?
45. The Cincinnati approach: findings. In the first year, the average teacher improved student math scores by .05 SD; in subsequent years this improved to .11 SD. Improvement was sufficient to move a 25th percentile teacher to near average. Reading scores did not improve. The evaluations retained a leniency bias typical of other evaluation programs. The pilot cost was high: $7,500 per teacher.
46. The Cincinnati approach: context. In the first year, the average teacher improved student math scores by .05 SD; in subsequent years this improved to .11 SD. Gains in the first two years of teaching are typically .10 SD in mathematics (Rockoff, 2004). Gains from being placed with highly effective peers are .04 SD in mathematics (Jackson & Bruegmann, 2009). The pilot cost was high: $7,500 per teacher. Rockoff, J. E. (2004). The Impact of Individual Teachers on Student Achievement: Evidence from Panel Data. American Economic Review, 94(2), 247-252. Jackson, C. K., & Bruegmann, E. (2009, July). Teaching Students and Teaching Each Other: The Importance of Peer Learning for Teachers. NBER Working Paper No. 15202.
47. Reliability of a variety of teacher observation implementations (reliability coefficient relative to state test value-added gain, with proportion of test variance explained):
Principal 1: .51 (26.0%)
Principal 2: .58 (33.6%)
Principal and other administrator: .67 (44.9%)
Principal and three short observations by peer observers: .67 (44.9%)
Two principal observations and two peer observations: .66 (43.6%)
Two principal observations and two different peer observers: .69 (47.6%)
Two principal observations, one peer observation, and three short observations by peers: .72 (51.8%)
Source: Bill and Melinda Gates Foundation (2013, January). Ensuring Fair and Reliable Measures of Effective Teaching: Culminating Findings from the MET Project's Three-Year Study.
48. Assessment Literacy in a Teacher Evaluation Framework. Presenter: John Cronin, Ph.D. Contact: Rebecca Moore, 503-548-5129; e-mail: [email protected]. The PowerPoint presentation and recommended resources are available at our SlideShare website.
49. Why it's time we stopped pretending schools should be managed like baseball teams.
50. Suggested reading: Baker, B., Oluwole, J., & Green, P. (2013). The legal consequences of mandating high stakes decisions based on low quality information: Teacher evaluation in the Race to the Top era. Education Policy Analysis Archives, 21(5).
51. Thank you for attending. Presenter: John Cronin, Ph.D. Contact: NWEA main number, 503-624-1951; e-mail: [email protected]. The presentation and recommended resources are available at our SlideShare site: http://www.slideshare.net/NWEA/tag/kingsbury-center
52. What about principals? The issue is the same with principals: it is difficult to separate the contribution of the principal to learning from the contribution of teachers. Source: Lipscomb, S., Teh, B., Gill, B., Chiang, H., & Owens, A. (2010, Sept.). Teacher and Principal Value-Added: Research Findings and Implementation Practices. Cambridge, MA: Mathematica Policy Research.
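A pattern worth noting in the MET reliability figures above: each "proportion of test variance explained" appears to be simply the square of the reliability coefficient (e.g., .51 squared is about .26, and .72 squared is about .518), the standard r-to-r-squared relationship for a correlation. A quick Python check using the numbers reported in the table:

```python
# Reliability coefficients and "proportion of test variance explained"
# as reported in the MET observation table (slide 47).
coeffs   = [0.51, 0.58, 0.67, 0.67, 0.66, 0.69, 0.72]
reported = [0.260, 0.336, 0.449, 0.449, 0.436, 0.476, 0.518]

for r, pct in zip(coeffs, reported):
    # variance explained by a correlate is the squared correlation coefficient
    assert abs(r * r - pct) < 0.005, (r, pct)
print("all rows consistent with r**2")
```

This is why moving from a .51 to a .72 coefficient nearly doubles the variance explained: the payoff of adding observers grows quadratically in the correlation.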
53. How does it work in classrooms? Two very important assumptions: the teacher directly delivers instruction that causes learning, and the teacher's impact can be measured within a school year. [Diagram: a +.25 gain between last spring and this spring.]
54. Four issues. How do you measure a principal? How accurate and reliable are these measures? What anticipated and unanticipated impacts do your measures have on behavior? Where should our energy really be focused?
55. It's good to learn from past failures.
56. So what are the issues? We've confused players with managers. The metrics are problematic. We've chosen the wrong focus for policy.
57. How does it work in education? Teacher or school value-added: how much academic growth does a teacher or school produce relative to the median teacher or school?
58. So what are the issues?