Examining The Relationship Between Teacher Evaluation And Student Assessment Results In Washoe County

October 7, 2004

Conducted by Steven Kimball, Brad White, and Anthony Milanowski.

Purpose of study

To explore whether the teaching described in the Framework for Teaching-based evaluation system is high-quality teaching that is reflected in measures of student achievement.

Research questions

  • Do teachers who score well on Framework for Teaching-based evaluation systems also help produce higher levels of student learning?

Population/sample

The study was carried out in Washoe County, a large, urban school district that was implementing a teacher evaluation system based on the Framework for Teaching (Danielson, 1996). The researchers drew data from third-, fourth-, and fifth-grade teachers who had sufficient evaluation data to create composite scores based on four elements of Domains 1 (planning) and 3 (instruction) of the Framework. The teacher sample consisted of 123 third-grade teachers, 87 fourth-grade teachers, and 118 fifth-grade teachers. Students who had both pre- and post-test scores on reading and math tests for the 2000-01 and 2001-02 school years were linked with their teachers for analysis (approximately 45% of students met this criterion). Reading and math scores were combined into a composite measure of achievement. At the first level of several hierarchical linear models (HLM), student demographics and pre-test scores were used to predict post-test scores. These first-level results were then compared with models whose second level added teachers’ composite evaluation scores and other potentially relevant characteristics.
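
The two-level structure described above can be sketched in general form. This is an illustrative specification only: the summary does not give the authors' exact equations, so the random-intercept form and all symbols below (Y, X, Eval, W, and the coefficients) are assumptions introduced here for clarity.

```latex
% Level 1 (students i within teacher j): post-test predicted from
% pre-test score and K student demographic variables (assumed form)
Y_{ij} = \beta_{0j} + \beta_{1}\,\mathrm{Pretest}_{ij}
       + \sum_{k=2}^{K+1} \beta_{k}\,X_{kij} + r_{ij},
\qquad r_{ij} \sim N(0, \sigma^{2})

% Level 2 (teachers): the teacher-level intercept predicted from the
% composite evaluation score and other characteristics W_j (assumed form)
\beta_{0j} = \gamma_{00} + \gamma_{01}\,\mathrm{Eval}_{j}
           + \gamma_{02}\,W_{j} + u_{0j},
\qquad u_{0j} \sim N(0, \tau_{00})
```

Under this sketch, the coefficient \gamma_{01} is the quantity of interest: the expected difference in adjusted student achievement associated with a 1-point difference in a teacher's composite evaluation score.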

Major results

  • Before controlling for students’ prior test scores and demographic characteristics, 17% to 27% of the variance in student achievement was attributable to teachers. After accounting for these factors, 5% to 15% of the variance could be attributed to teachers.
  • Teacher evaluation scores were a statistically significant predictor of student achievement in four of nine models that accounted for students’ prior achievement and demographic characteristics. In practical terms, results suggested that a 1-point increase in a teacher’s evaluation score corresponded to a 5.41-point increase on the fourth-grade reading assessment. However, teacher evaluation scores were not associated with student achievement in math in the fourth-grade model. In fifth grade, higher teacher evaluation scores predicted higher achievement in both reading and math, but in third grade, teacher evaluation scores were not significantly related to student achievement scores.
  • When teacher education and experience, as reflected by placement on the salary schedule, were included in subsequent analyses, they were weakly related to student achievement in fifth-grade reading and math, but not to achievement in either subject in the earlier grades. An extra $1,000 on the salary schedule translated to a half-point increase in test scores.
  • Teacher evaluation scores explained more variance in student achievement scores than teacher education and experience as reflected by their salaries.
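
The "variance attributable to teachers" figures above are the kind of quantity usually expressed as an intraclass correlation in a two-level model. The formula below is the standard definition, not one quoted from the study, and the variance components in the worked line are hypothetical numbers chosen only to show how a 15% share would arise. The last line restates the reported fourth-grade reading result as a slope.

```latex
% Share of achievement variance lying between teachers (standard ICC):
% tau_00 = between-teacher variance, sigma^2 = within-teacher variance
\rho = \frac{\tau_{00}}{\tau_{00} + \sigma^{2}}

% Hypothetical illustration (values NOT from the study):
% tau_00 = 15, sigma^2 = 85
\rho = \frac{15}{15 + 85} = 0.15

% Reported fourth-grade reading result, read as a slope:
% a 1-point difference in evaluation score
\Delta \hat{Y} = 5.41 \times \Delta \mathrm{Eval} = 5.41 \times 1 = 5.41
```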

Conclusions/recommendations

The authors conclude with several potential explanations for the lack of a strong and consistent relationship between teacher evaluation scores and student achievement in this study. These included 1) measurement error associated with the use of both criterion- and norm-referenced tests of student achievement and the omission of many Framework components in the composite measure; 2) the low-stakes, growth-focused context of the evaluation system and lack of emphasis on rater training and reliability; and 3) the lack of content-specific teaching standards to account for potential differences in effective pedagogy between subject areas.

More research is recommended to track the effects of teaching behaviors on student achievement over a longer span of time, to obtain more complete student achievement data for more robust analyses, and to examine the effect of more comprehensive teacher evaluation measures on student achievement. The authors also recommend determining whether certain evaluators’ scores are more strongly associated with student achievement data than others’ scores, which would suggest that those evaluators’ evidence-collection and decision-making strategies could help other evaluators improve their scoring.

FfT focus

The teacher evaluation system in this district was based on the Framework, and the authors include detailed information about the instrument and its use in the article. Teachers’ evaluation scores on the Framework-based observational instrument were correlated, if inconsistently, with their students’ average achievement, which offers some validation that the Framework measures high-quality teaching.