Conducted by Michael S. Garet, Andrew J. Wayne, Seth Brown, Jordan Rickles, Mengli Song, and David Manzeske.
Purpose of study
To examine whether an intervention that provided teacher performance measures and feedback, based on the Classroom Assessment Scoring System (CLASS) and the Framework for Teaching (FfT), affected classroom practice and student achievement when implemented in the study districts.
- To what extent were the performance measures and feedback implemented as planned?
- To what extent did the performance measures identify more and less effective educators and signal dimensions of practice that most needed improvement?
- To what extent did educators’ experiences with performance feedback differ for treatment and control schools?
- Did the intervention have an impact on teacher classroom practice and principal leadership?
- Did the intervention have an impact on student achievement?
Data sources included teacher attendance at orientation and training events, the frequency of classroom observations and feedback sessions, surveys of teachers and principals that asked about their perceptions of the intervention, surveys of educator experiences with performance feedback and desire to improve, observations of teachers’ classroom practice coded using CLASS and FfT protocols, teacher surveys of principal instructional leadership and teacher-principal trust, and students’ standardized test scores in reading and mathematics. Classroom practice scores were based on observations conducted during four “windows” each year. One observation was conducted by a school administrator and three by hired observers. After each observation, the observer prepared a report with ratings and narrative justification to discuss with the teacher during a feedback session.
The study used an experimental design in eight purposefully selected districts with at least 20 elementary and middle schools, data systems that were sufficient to support value-added analysis, and performance measures and feedback that were less intensive than those implemented for the study. Four of the eight districts chose to use the Classroom Assessment Scoring System (CLASS) and the other four chose Charlotte Danielson’s Framework for Teaching (FfT). Each district identified elementary and middle schools with English language arts and mathematics teachers of grades 4-8 to participate in the study. Sixty-three treatment schools and 64 control schools participated.
- The study’s measures were generally implemented as planned. For instance, teachers in treatment schools received an average of 3.7 and 3.9 observations with feedback sessions in Years 1 and 2, respectively.
- The study’s measures provided some information to identify educators who needed support, but limited information on the areas of practice educators most needed to improve. For example, although most teachers (more than 85 percent) had overall classroom observation scores in the top two performance levels, scores averaged over the year provided some reliable information to distinguish teacher performance. In Year 2, for example, depending on the assumptions used, reliability estimates for the four-window average overall scores were between .70 and .77 for the FfT. This implies that 70 to 77 percent of the variation was due to persistent variation in the quality of teacher practice, and the rest (23 to 30 percent) was due to measurement error. Differences in teachers’ observation ratings across dimensions, however, had limited reliability to identify areas for improvement.
- As intended, teachers and principals in treatment schools received more frequent feedback with ratings than teachers and principals in control schools. Treatment teachers reported receiving more feedback sessions on their classroom practice with ratings and a written narrative justification than control teachers. Treatment principals received more instances of oral feedback with ratings on their leadership than control principals. A majority of treatment teachers said the study’s feedback on classroom practice was more useful and specific than the district’s existing feedback. For example, about 65 percent of teachers reported that the study’s feedback was more useful than their district’s, and 79 percent reported that the study’s feedback was more specific about what constitutes high-quality teaching.
- The intervention had some positive impacts on teachers’ classroom practice, principal leadership, and student achievement. To assess the impact on classroom practice, the study team video-recorded lessons in both treatment and control schools and coded them with the two observation rubrics used to provide feedback. The intervention had a positive impact on teachers’ classroom practice as measured by the CLASS rubric, moving teachers from the 50th to the 57th percentile, but it had no impact on practice as measured by the FfT. The intervention also had a positive impact on the two measures of principal leadership examined: teacher-principal trust and instructional leadership. It had a positive impact on teacher-principal trust in Year 1 and on both instructional leadership and teacher-principal trust in Year 2. In Year 1, treatment principals, on average, received a score of 3.18 on the 5-point teacher-principal trust scale, compared with 2.96 for control principals. In Year 1, the intervention had a positive impact on students’ achievement in mathematics, amounting to about four weeks of learning. In Year 2, the impact on mathematics achievement was not statistically significant. The intervention did not have a statistically significant impact on reading/English language arts achievement.
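The arithmetic behind two of the figures above can be sketched in a few lines of Python. This is an illustration only, not the study's analysis: the function names are hypothetical, and the conversion from a percentile shift to standard deviation units assumes normally distributed scores, which the summary does not state.

```python
from statistics import NormalDist


def error_share(reliability: float) -> float:
    """Share of score variance attributed to measurement error.

    Reads a reliability coefficient as the share of variance that reflects
    persistent differences in practice, so the remainder is error.
    """
    if not 0.0 <= reliability <= 1.0:
        raise ValueError("reliability must be between 0 and 1")
    return 1.0 - reliability


def percentile_shift_to_sd(start_pct: float, end_pct: float) -> float:
    """Express a percentile shift in standard deviation units.

    ASSUMPTION: scores follow a normal distribution. This is an
    illustrative conversion, not the effect size reported by the study.
    """
    standard_normal = NormalDist()
    return standard_normal.inv_cdf(end_pct / 100) - standard_normal.inv_cdf(start_pct / 100)


if __name__ == "__main__":
    # Reliability range reported for the FfT in Year 2.
    for reliability in (0.70, 0.77):
        print(f"reliability {reliability:.2f}: error share {error_share(reliability):.2f}")

    # CLASS result: teachers moved from the 50th to the 57th percentile.
    print(f"50th to 57th percentile: {percentile_shift_to_sd(50, 57):.3f} SD")
```

Under the normality assumption, the move from the 50th to the 57th percentile corresponds to a shift of a little under 0.2 standard deviations.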
Treatment teachers discussed topics covered on the CLASS and FfT with observers more often than control teachers did, but they were no more likely than control teachers to say that they wanted to improve their skills in these areas or to participate in relevant professional development. The feedback that treatment teachers received also did not alter their ratings of their own effectiveness. A similar pattern emerged among principals in the treatment schools.
The study’s theory of action posited that teacher performance feedback would raise student achievement by improving classroom practice and principal leadership. Exploratory analyses indicated that classroom practice was positively associated with student achievement in mathematics, suggesting that improved classroom practice may have been one way that feedback affected achievement. Similar exploratory analyses found no association between principal leadership and student achievement.
The Framework for Teaching was one of the measures of teacher performance used in this study to provide feedback on classroom practice. The study’s theory of action assumed that performance feedback for educators would improve student achievement by improving teachers’ practice and principals’ leadership. The study was not designed to provide a rigorous causal test of this assumption. However, exploratory analyses indicate that classroom practice, using the study’s outcome measure based on video-recorded lessons coded with the CLASS and the FfT, was positively associated with student achievement in mathematics and reading, suggesting that improved classroom practice may have been one way feedback boosted achievement.