Excellence In Teaching Project

October 7, 2009

Conducted by Lauren Sartain, Sara Ray Stoelinga, and Eric Brown (2009).

Purpose of study

To describe the initial results of a pilot teacher evaluation and improvement program aiming to provide a common definition of effective teaching, guide meaningful discussion and collaboration around teaching practice, and direct teacher development along a continuum that will help teachers have a greater impact on student learning.

Research questions

  • What are the technical properties—including reliability and validity—of the research tool itself?
  • How do principals perceive the utility of the new teacher evaluation system? Does it help achieve the stated goals?
  • How do teachers perceive the utility, fairness, and helpfulness of the new evaluation system? What components of the evaluation system are the most or least helpful? Are the pre- and post-conferences useful?
  • What supports are in place in the pilot year, are they effective, and how well are the goals and procedures communicated across the evaluation system?
  • Does the new system have the desired effect at the school level, including shaping the professional development, professional culture, teacher hiring, the quality of teaching, and student learning?


Forty-four schools within 4 areas of the Chicago Public Schools district (approximately half of the schools within these areas) were randomly selected to participate in the 2008-2009 pilot of the new teacher evaluation system using the Danielson Framework for Teaching (FfT). Observations focused on beginning teachers and those who received low ratings under the prior checklist system. Administrators and external observers completed 277 matched observations of elementary and middle school teachers of English language arts, mathematics, science, social studies, and other subjects.

Major results

  • The FfT has the potential to identify strong and weak teachers reliably. Administrators would benefit from additional support and training on certain levels (Basic and Proficient) and components of the Framework, especially in Domain 3 (Instruction):
    • 3a Communicating with Students
    • 3c Engaging Students in Learning, and
    • 3d Using Assessment in Instruction.
  • Teachers tended to have more difficulty implementing the instructional aspects of the Framework than those related to classroom management. Teachers struggled most with components 3b (Using Questioning and Discussion Techniques) and 3c (Engaging Students in Learning). They received the highest ratings on components 2e (Organizing Physical Space) and 2a (Creating an Environment of Respect and Rapport).
  • Framework ratings were not used for summative teacher evaluation during the pilot year of the initiative, but district officials did consider setting the performance benchmark at Basic proficiency. This would increase the proportion of teachers classified as low-performing from 0.3% to 8%.
  • Principals and teachers expressed positive feedback regarding the quality of the Framework and its ability to measure teaching performance accurately.
  • Teachers were less positive about pre- and post-conferences associated with their classroom observations, citing concerns about the time commitment and implementation of conferences.
  • The majority of principals and teachers expressed positive opinions about the trainings they received on the FfT.
  • Principal buy-in was categorized along 4 themes:
    • Those who felt a “paradigm shift” in their ideas about evaluation and appreciated the Framework’s increased objectivity and attention to specific teaching skills;
    • Those with high enthusiasm who perceived strong teacher buy-in along with substantial changes to classroom practice as a result of the new evaluation system;
    • Those who expressed mixed emotions about numerous initiatives underway and the labor-intensive nature of FfT observations; and
    • Those with low enthusiasm who felt they were already doing the right type of evaluation or that they “just knew” their teachers’ abilities, felt evaluations had little influence on instructional practice, and worked with low buy-in teachers.
  • Over half of principals reported high buy-in and positive perceptions of the new evaluation system despite the rigorous evaluation process and time commitment.


The pilot evaluation identified opportunities for additional training, including

  • taking notes and translating them into evidence for FfT ratings,
  • learning more about the content of Framework components,
  • facilitating more reflective pre- and post-conferences,
  • deepening understanding of Framework Domains 1 & 4,
  • increasing knowledge about the Framework and the Excellence in Teaching pilot,
  • expanding training about the entire evaluation process,
  • digging deeper into challenging components (e.g., 2e, 3a, 3c, and 3d),
  • providing video exemplars of various levels of performance for each component, and
  • managing the time and implementation challenges of the evaluation system.

Most principals stuck to the time limits described for the evaluation process and were highly engaged. However, challenges included higher expectations of teaching practice under the new system, constrained timeframes that limited teachers’ opportunities to reflect and improve their practice, and the scheduling demand of evaluating all teachers within a single school year. Due to challenges with inter-rater reliability and rating certain problematic components, several approaches may be useful in setting benchmarks for summative teacher evaluations:

  • using a “meets standards”/“does not meet standards” scale,
  • weighting Framework components differentially, including down-weighting challenging components until reliability is improved, and
  • using Framework ratings to identify appropriate supports and professional development for teachers.
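
As a rough illustration of how differential weighting and a two-category scale might fit together, the following Python sketch computes a weighted average of component ratings and compares it to a benchmark. It is not the method used in the pilot. The 1-4 numeric mapping of the four FfT levels, the function names, the example weights, and the benchmark value of 2.0 are all assumptions chosen for illustration.

```python
from typing import Dict

# Assumed numeric mapping of the four FfT rating levels (illustrative only).
LEVELS = {"Unsatisfactory": 1, "Basic": 2, "Proficient": 3, "Distinguished": 4}


def weighted_score(ratings: Dict[str, str],
                   weights: Dict[str, float],
                   default_weight: float = 1.0) -> float:
    """Return the weighted mean rating across Framework components.

    ratings maps a component code (e.g. "3c") to a level name.
    weights maps a component code to its weight; components not
    listed fall back to default_weight.
    """
    total = 0.0
    total_weight = 0.0
    for component, level in ratings.items():
        w = weights.get(component, default_weight)
        total += w * LEVELS[level]
        total_weight += w
    if total_weight == 0:
        raise ValueError("at least one component needs a nonzero weight")
    return total / total_weight


def meets_standards(score: float, benchmark: float = 2.0) -> bool:
    """Collapse a score into meets / does not meet (hypothetical benchmark)."""
    return score >= benchmark


if __name__ == "__main__":
    # Hypothetical ratings for one teacher on four components.
    ratings = {"2a": "Proficient", "2e": "Distinguished",
               "3b": "Basic", "3c": "Unsatisfactory"}
    # Hypothetical choice: halve the weight of a hard-to-rate component.
    down_weighted = {"3c": 0.5}

    print(weighted_score(ratings, {}))
    print(weighted_score(ratings, down_weighted))
    print(meets_standards(weighted_score(ratings, down_weighted)))
```

In this sketch, lowering the weight on a component reduces its influence on the overall score without discarding the rating, so the weight could be raised again once rater agreement on that component improves.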

FfT focus

The Excellence in Teaching pilot in Chicago used the Danielson Framework for Teaching (FfT) to observe and rate the quality of classroom practice of teachers in the district. Despite some challenges with differentiating levels of proficiency and rating more complex, challenging instructional components, the FfT showed potential for high reliability with additional training and rating supports. Most principals and teachers felt positively about the new evaluation process in terms of its quality and ability to assess teaching practice accurately. Additional training needs and supports for using the Framework ratings for summative as well as formative evaluations are identified.