Teacher Evaluations In An Era Of Rapid Change: From “Unsatisfactory” To “Needs Improvement”

October 7, 2014

Conducted by Chad Aldeman and Carolyn Chuong.

Purpose of study

To examine what can be learned from efforts to revise teacher evaluation systems between 2010 and 2014 by synthesizing data from 17 states and the District of Columbia.

Research questions

Rather than specifying research questions, the authors reviewed the teacher evaluation data to identify five major trends.


An appendix table (pp. 31-32) specifies the data available for each included state and Washington, DC. Available data varied by the school personnel included, years covered, level of data (e.g., district- versus school-level), source, inclusion of student growth or learning outcomes, and participation (e.g., pilot versus full implementation). All sites collected data on teachers, and almost 75% also covered principals.

Major results

  • Districts have made progress in differentiating between multiple levels of teaching performance, rather than painting all teachers as “satisfactory” or “unsatisfactory.” This greater differentiation can identify educators who would benefit from targeted support.
  • The use of high-quality observational rubrics, such as the Framework for Teaching, provides teachers with more specific, constructive, and timely feedback on their classroom practice. Both principals and teachers have had positive reactions to the observations, reporting more time spent on teacher observation and reflection and more useful feedback on practice.
  • State policy changes have not convinced districts to factor student learning growth into teacher evaluation ratings. Districts have responded in three ways: refusing to factor student growth into evaluations, delaying its incorporation into evaluation scores, and obscuring it through idiosyncratic local implementation and post-hoc rating “upgrades” for underperforming teachers.
  • Districts have broad discretion to implement teacher evaluation policies under statewide guidelines, choosing how the components of teacher evaluations are scored, compiled, and weighted in final ratings. One result of this flexibility is substantial variation in both practices and evaluation scores between districts in the same state.
  • Districts rarely use teacher evaluation data to make consequential decisions about teacher promotion, compensation, or dismissal. Instead, credentials and seniority—rather than classroom performance—often determine who receives tenure, promotions, and pay raises. Dismissals continue to be very rare.


The authors make four recommendations:

  • States should collect and publicly report teacher evaluation ratings, including the component elements that make up those ratings, and how ratings are used to drive personnel decisions, to promote transparency and accountability.
  • States should work closely with districts to understand the causes of outcomes and variation between districts and ensure that evaluations are consistently rigorous across schools and classrooms.
  • States should not stop or slow reforms in teacher evaluations before these new policies have a chance to take effect.
  • States should expect teacher evaluation reforms to co-exist with other educational reforms, such as the Common Core State Standards.

FFT focus

The review notes that the observational rubrics used for teacher evaluations in Arkansas, Delaware, Florida, Idaho, Illinois, New Jersey, New York, South Dakota, Washington, Cincinnati, Los Angeles, and Pittsburgh are based on Danielson’s Framework for Teaching, a “research-backed” protocol.