General questions about the Framework
When I wrote Enhancing Professional Practice in 1996, I intended it to be a definition of good teaching, in all its complexity. I hoped (and wrote) that it might be useful for any number of purposes: first, and most importantly, for teachers’ own self-assessment and reflection; for teacher preparation, recruitment and hiring, mentoring and induction; for professional development; and yes, also teacher evaluation. The last was simply one of many uses to which it could be put.
However, that’s been a source of some tension, since many educators have made evaluation its first use. And in the last few years, with the new emphasis on teaching evaluation, people have had to base their evaluations on something. I’d prefer that they use my Framework for that purpose, rather than some other instructional model that’s not as well researched or well accepted. That’s not to say that I approve of every evaluation system that’s been developed, particularly if it’s seen as something that’s done to teachers, rather than as a collaborative effort.
First, the Framework for Teaching (FFT) is a valid instrument for defining effective teaching. Several large research studies (the MET project, a study in Chicago) demonstrated its predictive validity: that is, when teachers demonstrate high levels of proficiency on the FFT, their students show greater learning gains than do the students of teachers who perform less well. The latest edition of the FFT (2013) incorporates the instructional implications of the Common Core State Standards.
Second, the Framework for Teaching represents what Lee Shulman has called the “wisdom of practice.” That is, when teachers consider the FFT in light of the complex work they do, it makes sense to them and illuminates some of the complexity. Furthermore, the levels of performance represent a natural progression for teachers as they acquire greater experience and expertise.
Third, the Framework for Teaching is supported by a large ecosystem of training and online materials. These comprise face-to-face training for all educators, training and assessment of observers, and published resources to support observation, evaluation, and professional development. These are available through the Danielson Group and other partners.
In general, I discourage educators from making revisions to the Framework, since that can jeopardize its validity. The language in the levels of performance for the FFT has evolved since 1996; it has become more precise and tighter, with clearer distinctions between the different levels. I am aware of how challenging this is to do well, and I advise practitioners to adopt the FFT as it stands. If people want to customize it for their own setting, my advice is to add possible examples to illustrate practice in that setting.
The FFT is intended to apply to all disciplines, K-12. That is grounded in the simple fact that teaching, in whatever context, requires the same basic tasks, namely, knowing one’s subject, knowing one’s students, having clear outcomes, establishing a culture for learning, engaging students in learning, etc. The details of how each of those things is done, naturally, are highly level- and discipline-specific, and require expertise on the part of teachers in these settings. But in general, the FFT is intended to apply equally to primary mathematics and high school studio art. It’s certainly true that the details of teaching in, for example, world languages and fine arts, are different from the details in primary reading, but it’s also the case that the details of primary reading are different from those of middle school mathematics, or high school history. That is, while teaching is highly contextualized, the basic work of teaching is universal.
- All populations?
Every class includes students of a wide variety of backgrounds, cultures, native languages, and knowledge and skill; indeed it is one of the greatest challenges of teaching to create learning experiences that address this wide variation. Some educators, in particular, are concerned about the applicability of the FFT to classrooms of students with special needs; they fear that their performance could never be judged to be at the highest level because of the limitations of their students. This is an understandable concern. However, the FFT should be considered in light of student characteristics of any group. For example, what constitutes a high level question for a special needs student is a different question than that for a regular education student, but it is higher order for that student.
- Teachers who teach many students, for example vocal music?
There’s no doubt that the Framework for Teaching must be considered in light of the “context” of the classrooms in which teachers are being observed, and that “knowing one’s students” is different, in practice, when a teacher teaches hundreds (as in music, PE, or art) from what it might be in a primary classroom with, say, 23 students. As in other aspects of using the FFT, it’s important for common sense and reason to prevail. Therefore, a vocal music teacher might know that the alto section is coming in too early at a specific point in a piece of music. That same teacher might also know, however, that a particular student has a strong voice that might be suitable for a small solo role. But much of the teacher’s knowledge of students will be, inevitably, group-based.
- Non-classroom specialists, for example school nurses, librarians, counselors, etc.?
Non-classroom specialists (nurses, etc.) typically do some teaching, but they also normally do many other things; for example, school nurses may dispense medications. That is, the work they do is somewhat different from that of classroom teachers, and they need their own frameworks. In the second edition of the Framework for Teaching (ASCD, 2007) I drafted specialist Frameworks, and I encourage educators to use or adapt them. It’s important to recognize, however, that they do not have the extensive validity research that the FFT has for teaching, and it’s hard to envision how that might be done.
While most educators have opinions on this subject, there is no research (at least not yet) to suggest that any of the components in the Framework for Teaching are more important than others. However, in the MET study, teachers’ ratings were higher, in general, for the components in Domain 2 than they were for Domain 3, suggesting that getting the classroom environment “right” is a prerequisite to serious attention to instruction, and might explain why many mentoring programs begin with the procedural and management concerns of first-year teachers. However, that does not imply that Domain 2 is more important than Domain 3, only that it develops earlier. Skill in Domain 3 is absolutely critical for promoting student engagement and learning, and while it may develop after Domain 2, teaching cannot be considered “good” until teachers perform the Components of Domain 3 at a high level.
The Framework for Teaching is definitely not a checklist of specific behaviors. For example, in 2a (creating an environment of respect and rapport) there are many ways teachers create an environment that’s safe for students to take intellectual risks, in which they respect the contributions of their classmates, etc. – not just one practice that every teacher should demonstrate. The same can be said for each of the 22 components. You can illustrate this by picking one, considering how you would demonstrate proficient or distinguished level performance, and then asking whether there are other things you (or someone else) might do that would also fit that description. The answer is certainly that there are “lots of ways to be good.”
General Questions about Teacher Evaluation
Teacher evaluation serves two essential purposes: quality assurance and promoting professional learning.
- For quality assurance, educators must be able to assure parents and the larger community that they are well equipped to do the essential work of educating children. As an essential component of an education system, the quality of teaching must be high.
- For promoting professional learning, a highly evolved system produces information about teachers’ strengths and weaknesses, and can therefore point the teacher toward areas for growth.
There is, inevitably, some tension between these two purposes; a system of accountability can feel like an “inspection” to teachers, while one entirely focused on professional learning can result in underperforming teachers not receiving important information about their teaching. My recommendation in resolving this tension is to establish clear standards as to the level of teaching expected for teachers with different levels of experience, and once teachers have demonstrated that level of proficiency, concentrate all the efforts in the system on promoting ongoing learning.
There are two principal indicators of teacher effectiveness: teacher practices and the impact of teachers’ work on student learning. Most evaluation systems use a combination of these indicators, in what they call “multiple measures,” which sometimes includes student perception surveys.
My area of expertise is in establishing teacher practices that are found to produce high levels of student learning; that’s what the Framework for Teaching represents. There are many challenges inherent in the use of value added test score data to evaluate the effectiveness of individual teachers, but I’ll leave those challenges to the measurement experts.
Domains 2 and 3 of the Framework describe classroom practice, and can be assessed through observations of teaching. These observations can (and I think should) be supplemented by samples of student work, which also provide another indication of student engagement in challenging work.
Domains 1 and 4 represent “behind the scenes” work, important to good teaching, but not directly observable in the classroom. Observers can, occasionally, obtain indirect evidence of Domain 1 (Planning and Preparation) during an observation, but more direct evidence is obtained from planning documents, and a pre-observation (planning) conference. As for Domain 4 (Professional Responsibilities), there is rarely evidence for that in an observation, simply because those activities don’t happen in the classroom. Domain 4 is best assessed through the examination of artifacts that illustrate the teacher’s skill in the different components of the Domain. And while Domain 1 can, if there’s time to do so, be discussed in reference to every (announced) observation, I recommend that the components of Domain 4 be assessed annually.
This is a very challenging question, and one without an easy answer. It is certainly true that many aspects of teaching (particularly those in Domain 2) are generic, and can be observed in a class regardless of the subject being taught, and by an observer without content expertise. However, it’s also true that in advanced subjects, or at the higher levels of performance in all subjects, content and content-specific pedagogy matter; if observers don’t have that expertise, it’s difficult for them to be aware of the nuances in a teacher’s practice.
If a school is fortunate enough to have content-area supervisors available, those individuals can be enormously helpful in observing teachers and reviewing planning documents, simply for the accuracy of the content and the wisdom on content-specific pedagogies. If such individuals are not available, then I recommend a conversation with the teacher with questions designed to elicit evidence of expertise, such as, in world languages, what approaches they have found effective in helping their students acquire a good accent, or how this topic in science is related to the one they were exploring with their students last week.
Training in the Framework for Teaching (conducted, if possible, in groups with both teachers and administrators present) serves to establish a common language about good teaching, and invites important conversations regarding practice. The importance of the common language cannot be overemphasized; it’s one of the aspects of adopting the FFT that many educators value the most. Teachers say things like “Finally, I know what my principal is looking for in an observation!” And the common language enables teachers to engage in meaningful discussion both with their colleagues and with supervisors.
In high-stakes teacher evaluation, evaluators must make consequential decisions about teachers, decisions that could affect ratings, compensation, or even their continued employment. For that reason, it’s essential that evaluators demonstrate that they can evaluate performance accurately and consistently, and base those judgments on evidence. These skills can be both taught and tested, and in my view a fair system demands that evaluators pass, in effect, a test to demonstrate their skill. After all, it’s impossible to get a driver’s license in any state without passing a test; it does not make sense that school evaluators should be able to make high-stakes decisions about teachers without demonstrating that they can do so accurately.
A related question concerns how long such certification is valid, and whether evaluators should periodically engage in recalibration exercises. Research on this point has been limited up until now, although we have common sense to guide us. Common sense would suggest that evaluators should recalibrate at least annually, and recertify once every three years.
Questions about observations of classroom practice
One of the important findings of the MET study was that the reliability of observations increased with both the number of observations and the number of observers. Four observations conducted by several different observers, while probably unrealistic for practicing educators, were about twice as reliable as an observation of a single lesson. Interestingly, for those components of the FFT that were observed during a lesson segment, 15 minutes gave as high reliability as observations of 45 minutes. From a practical perspective, this means that if both a principal and an assistant principal have been trained and certified as evaluators, the most reliable evaluations are obtained if they conduct observations of the same teachers.
Of course, a full lesson provides a teacher the opportunity to demonstrate his/her skill in planning and executing an entire learning experience for students, and enables the teacher and supervisor to engage in the collaborative observation cycle (see below).
Overall, my recommendation is that the observation component of a full evaluation consist of one full lesson and three additional, shorter observations, and that these observations be conducted by two different individuals.
This issue involves trade-offs between conflicting purposes of teacher observation. On the one hand, an announced observation, for an entire lesson, gives teachers the opportunity to provide evidence of their skill in planning in a pre-observation (planning) conference, and deliver a lesson that represents their best work. On the other hand, some teachers are tempted to do a “dog and pony show” for their announced observation, whereas when administrators conduct a number of shorter, unannounced observations, they can discern patterns in the teacher’s practice. There are strengths in both approaches, which is why I recommend both announced (formal) and unannounced (informal) observations.
The Danielson Group recommends a “collaborative observation cycle” process, and it applies to a formal, announced observation. It consists of the following steps:
- A pre-observation (planning) conference, following an established protocol, in which the teacher explains what he/she is planning for the students to learn, how the teacher proposes to engage students in the lesson, and how (and when) the teacher will know whether the students have reached the desired outcome.
- A classroom observation, for an entire lesson or class period. The observer takes notes, recording only evidence, and not making any interpretations or judgment of that evidence.
- A period of consolidation, in which:
  - the observer shares the notes with the teacher, and the teacher has an opportunity to supplement the observer’s notes if they are not complete
  - both teacher and observer assign each piece of evidence to a component in the Framework for Teaching. If applicable, a single piece of evidence may be assigned to more than one component
  - both teacher and observer determine which level of performance they believe is represented by the evidence for each component, and why they think that. A recommended technique is for each individual to use a highlighter (either on paper or electronically) on the rubric to represent the words that best characterize the evidence for each component
- A post-observation (reflection) conference in which the teacher and observer compare their interpretations of the evidence for each component, and together decide the appropriate level of performance for each component. If they disagree, the observer’s judgment must prevail, but the observer should have sufficient humility to recognize that the teacher’s interpretation may be the correct one.
- Together, the teacher and observer identify the strengths of the lesson, the areas for growth, and recommended actions the teacher might take to address the areas for growth.
In a full lesson, a teacher will demonstrate most of the components in Domains 2 and 3. However, this is not always the case; for example, the lesson may not include the need for a teacher to make an adjustment to the lesson (3e). Furthermore, in a brief observation, there may be no evidence for several components. For example, it’s possible that in the particular 15 minutes observed, the students were not engaged in a discussion (3b) or there were no transitions or other evidence of classroom procedures (2c). In situations where there is no evidence for a component, it should be recorded as “no evidence”; the lack of evidence should not result in a low score for that component.
Ideally, teachers should be observed in the full range of situations in which they teach; an elementary teacher might be far more effective in, for example, mathematics, than in literacy. Or a high school science teacher might do a better job in physics, for example, than biology. The observation of a single lesson would not be sensitive to these differences. On the other hand, if several observations (even brief ones) are conducted with each teacher (ideally by several different observers) then the teacher can demonstrate the full range of his/her skills.
Video is a powerful technology, useful for a number of purposes, primarily professional development. New technological developments enable teachers to videotape their own teaching, review what they’ve captured, and either delete it or share it with colleagues. This practice can make a significant contribution to such practices as Lesson Study, and generally enrich the work of professional learning communities. Furthermore, the software allows teachers to share just a small clip of a lesson (where, for example, they’re trying a new approach), make a comment on that clip, and share just that with colleagues, making the entire process efficient.
Similarly, video can be used for professional development in conversations between a teacher and a supervisor. In such discussions, they watch a lesson together, pause it at different points, and explore different possible courses of action. When used in this manner, the video becomes a tool in solving “problems of practice” that contribute to the complexity of teaching.
Video can also be used in teacher evaluation, in several different ways:
- As a tool for observers to maintain their accuracy. Observers can look at the video of a class, and compare, with one another, the evidence they collect for the different components, and how they interpret that evidence against the levels of performance. When used in this manner, that video should not be used as part of the evaluation of the teacher, but only for ongoing training and calibration of observers.
- As a “second opinion” on the quality of a lesson. If an observer and a teacher disagree as to how a lesson should be evaluated, the only records they have of the lesson are the observer’s notes and the teacher’s memory; both can be flawed. A videotape of that same lesson can also be viewed by another trained and certified observer (even in a remote location) as another pair of eyes on the same events.
Teacher observation and evaluation, when done with paper and pencil, generates a voluminous amount of data; keeping track of this information, and organizing it so patterns are revealed, is just what computers are good for.
Specifically, the use of computer technology enables observers to record the events of a lesson, assign these notes to a component of the Framework for Teaching, share those notes with the teacher, and determine the level of performance. The evaluator can also review artifacts the teacher has submitted, primarily for Domains 1 and 4, and discuss them with the teacher. The teacher can do all those same things, that is, submit artifacts (for example the lesson plan for a lesson, or examples of family communication), review the observer’s notes and their alignment with components in the FFT, and determine an appropriate level of performance in preparation for the post-observation (reflection) conference with the evaluator.
Overall, the use of technology minimizes the time needed for the mundane aspects of teacher evaluation, that is, writing in longhand and organizing massive amounts of data, and maximizes the time available for the important part of the process, namely the professional conversations.
Questions about the evaluation of Domains 1 and 4
Domains 1 and 4 represent the “behind the scenes” work of teaching: essential for accomplished practice but not visible in the classroom. This means that a teacher’s skill in these domains must be demonstrated through artifacts: planning documents for Domain 1, and artifacts reflecting a teacher’s professionalism for Domain 4.
In the book The Handbook for Enhancing Professional Practice: Using the Framework for Teaching in Your School (ASCD, 2008) I described, in considerable detail, the sources of evidence for each of the components in the FFT (pp. 13-16), and offered sample directions and “scoring guides” for artifacts that could serve to provide evidence for Domains 1 and 4 (pp. 144-165). But it’s important to bear in mind that while the components in Domain 4 are quite distinct from one another, and thus need to be demonstrated separately, those in Domain 1 are highly intertwined. Therefore, for Domain 1, teachers can submit a single document, for example a Unit Plan, which, depending on its level of detail, can provide evidence of all of the components of Domain 1.
Many of the components of Domain 1 (particularly 1c, 1e, and 1f) can be assessed for a single lesson, and demonstrated through a lesson plan and a pre-observation (planning) conference. Naturally, this applies only to a formal, announced observation, since an unannounced observation does not include a pre-observation conference. However, to include the evaluation of Domain 1 in every observation adds an unnecessary burden, for both the teacher and the evaluator; therefore, I recommend that Domain 1 be assessed annually. The same is true for Domain 4, but for slightly different reasons. Teachers don’t “demonstrate” Domain 4 in the context of each lesson – with the possible exception of 4a, which is revealed in a post-observation (reflection) conference – so I recommend an annual conference between a teacher and the evaluator to examine the artifacts that the teacher has assembled as evidence of Domain 4 (examples of record keeping systems, communication with families, professional development activities, involvement with colleagues, etc.). Planning documents (for example, a unit plan) can be examined at the same time.
Questions about making evaluative judgments based on observation data and examination of artifacts
For the observable domains, that is, Domain 2 and Domain 3, the answer to this question depends on how many observations have been conducted, whether they are announced or unannounced, and whether the information from each observation is captured electronically or only on paper. Ideally, an observer can look at the assessments from each observation, and examine them as a whole; this is done most readily if the information has been stored electronically. But, regardless of the technology used, the observer must consider the “preponderance of evidence” to determine the level of performance for each component. Alternatively, the process can specify that the ratings for each of the observations are averaged to arrive at a mean or median score for the component. The same process can be followed for Domains 1 and 4: following an examination of the artifacts for each component, a judgment is made linking the evidence to the statements in the levels of performance.
It should be remembered, however, that the score resulting from an average of scores on individual components of the FFT is just that, an average of performance. It, in itself, does not constitute an evaluative judgment about that teacher. For example, if a teacher’s performance improves over the course of a school year in some aspect of teaching, the evaluator might want to consider that improvement when making the final, evaluative, judgment. (See next question/answer.)
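As a rough illustration of the averaging alternative described above, the following Python sketch summarizes the ratings for one component across several observations. The numeric coding of the four levels (1 to 4), the function name summarize_component, and the use of None to stand for “no evidence” are all assumptions made for this example; they are not part of the Framework or of any particular evaluation system. Observations with no evidence are simply left out, so that a lack of evidence does not pull the score down.

```python
from statistics import mean, median

# Assumed numeric coding of the four FFT levels of performance (illustrative only).
LEVELS = {"unsatisfactory": 1, "basic": 2, "proficient": 3, "distinguished": 4}


def summarize_component(ratings, method="mean"):
    """Summarize one component's ratings across several observations.

    `ratings` is a list of level names; None stands for "no evidence"
    and is skipped rather than counted as a low score.
    Returns None if no observation produced evidence for the component.
    """
    scores = [LEVELS[r] for r in ratings if r is not None]
    if not scores:
        return None
    if method == "mean":
        return mean(scores)
    if method == "median":
        return median(scores)
    raise ValueError(f"unknown method: {method!r}")


# Hypothetical ratings for component 3b across four observations;
# the second observation yielded no evidence for this component.
ratings_3b = ["proficient", None, "basic", "proficient"]
print(summarize_component(ratings_3b, "mean"))
print(summarize_component(ratings_3b, "median"))
```

A summary of this kind is only an input to the evaluator’s judgment; it does not capture, for instance, whether the teacher’s practice improved over the year.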
The headings in the levels of performance in the Framework for Teaching (unsatisfactory, basic, proficient, and distinguished) are descriptive words; that is, they don’t, on their own, make a judgment; they merely describe the practice. On the other hand, words used to evaluate teachers (words like ineffective, needs improvement, effective, and highly effective) are judgmental words; they are used to evaluate. Many educators are inclined to simply equate the descriptive words with the evaluative words, and mandate, for example, that in order for a teacher to receive an “effective” rating, all the components must be rated at the “proficient” level. Some systems even replace the FFT descriptive words (basic, etc.) with the judgment words (effective, etc.).
I recommend that school districts (or states, if the decisions are made at that level) use different words for the evaluative judgments made regarding teachers from the words used for the levels of performance of practice (such as unsatisfactory, etc. in the Danielson Framework). In that case, evaluators must be able to translate from one to the other.
This requires the application of an algorithm, typically specified by the state or district administration.
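To make the idea of such a translation concrete, here is a minimal Python sketch of one possible rule. Everything specific in it is hypothetical: the 1 to 4 numeric scale for component scores, the cut points, and the function name evaluative_rating are invented for illustration, and an actual system would use whatever rule its state or district specifies. The four evaluative labels are the example words mentioned above.

```python
# Hypothetical cut points on an assumed 1-4 scale, highest first.
# A real system would take its rule from state or district policy.
CUT_POINTS = [
    (3.5, "highly effective"),
    (2.5, "effective"),
    (1.5, "needs improvement"),
]


def evaluative_rating(component_scores, cut_points=CUT_POINTS):
    """Translate component scores (assumed 1-4 scale) into an evaluative label.

    The scores are averaged and compared against the cut points from the
    highest down; anything below the lowest cut point is "ineffective".
    """
    if not component_scores:
        raise ValueError("at least one component score is required")
    average = sum(component_scores) / len(component_scores)
    for threshold, label in cut_points:
        if average >= threshold:
            return label
    return "ineffective"


# Hypothetical component scores for one teacher (average 3.0).
print(evaluative_rating([3, 3, 2, 4]))
```

A mechanical rule like this produces only a starting point; the evaluator may still need to weigh other considerations, such as improvement over the school year, before making the final judgment.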
Questions about Resources
Yes, both the 2013 and 2011 versions of the Framework for Teaching Evaluation Instrument are available in PDF format from the Danielson Group website. Any educator may download this file and use the print version for his or her own personal use. The free downloads are listed on the Framework page.
However, neither version of the Framework for Teaching Evaluation Instrument (2011 and 2013) may be incorporated into any third party software system. Acceptance of either document constitutes agreement not to: (i) copy, modify, translate, or create derivative works from the instrument; (ii) load, integrate or incorporate any portions of the instrument with any other publication, software, database or other content or work; (iii) transfer the instrument to any third party for commercial use; or (iv) remove or alter any notices in the instrument. The Framework for Teaching Evaluation Instrument (2011 and 2013) is provided on an as-is basis and no warranties are expressed or implied.
The following publications are available from the ASCD website as PDF e-books:
- Enhancing Professional Practice: A Framework for Teaching, 2nd edition by Charlotte Danielson (#106034)
- Enhancing Student Achievement: A Framework for School Improvement by Charlotte Danielson (#102109)
- The Handbook for Enhancing Professional Practice: Using the Framework for Teaching in Your School by Charlotte Danielson (#106035)
- An Introduction to Using Portfolios in the Classroom by Charlotte Danielson (#197171)
- Teacher Evaluation to Enhance Professional Practice by Charlotte Danielson and Thomas L. McGreal (#100219)
- Teacher Leadership That Strengthens Professional Practice by Charlotte Danielson (#105048)