Beyond lesson observation grades?

by admin | 19 Jan, 2014 | Advice, Leadership, Teachers | 4 comments

Mary Myatt is a lead inspector for Ofsted, adviser, writer and trainer who blogs at MaryMyatt.com. She kindly agreed to write this blog for us after appearing on a panel at a joint Teach First & Teacher Development Trust event on Lesson Observation. You can see the videos here.

If there was any doubt about the inaccuracy of lesson observations, it was squarely squashed by Prof Robert Coe during the lesson observation debate organised by David Weston from the Teacher Development Trust and Sam Freedman from Teach First. It is not possible to judge accurately the quality of teaching from a twenty-minute lesson observation. So why are lessons graded?

To be clear, inspection teams make judgements on aspects of teaching, achievement, behaviour, leadership and management when they visit lessons. And they only make a note of evidence if they find it. But this evidence doesn’t stand alone. It is always, always backed up with further information. This includes data on progress, conversations with students, and the views of students, staff and parents. So the commentary when feeding back to staff should be along the lines of ‘on the evidence of what I observed during the time I was in your lesson, these aspects were outstanding/good/require improvement because...’ It must be made clear that these are not judgements about an individual’s overall practice, but based on what was seen during the observation.

What goes wrong is when schools simply aggregate grades from the lessons observed and state that these equal the quality of teaching overall. This is highly misleading if it hasn’t been triangulated with some of the other measures.

To overcome the narrowness of lesson grades, here are four suggestions for schools to create evidence that is robust and fair and does not rely on individual lesson observation grades.

One: Improving practice is based on Lesson Study models or on formative conversations involving everyone, including senior leaders, on what works. Tom Sherrington has posted about the impact of Lesson Study and he notes the change in dynamics of working in this way. Talking about teaching is a top priority: Alex Quigley has posted about ‘creating a climate where quality teaching is the subject of conversations at all times’ where he draws on the Hattie research and links to a brilliant, short clip of Tim Brighouse describing how this works. Chris Moyse has also written about how his school does not grade lessons, but colleagues work collaboratively and formatively to improve practice.

Two: Carefully considered student perception surveys are in place. This was one of the most interesting things to come out of the TDT/TF debate. There is an emerging body of research which shows that one of the most accurate indicators for the quality of teaching comes from students. It is possible to do this and get high quality responses where surveys are carefully constructed, so that there is no room for personal, irrelevant comments.

While a great deal of work was done on student voice several years ago by, amongst others, Professors David Hargreaves and Jean Ruddock, it is now being replicated by the Measures of Effective Teaching project. If schools were prepared to trial some of these surveys, summarise the results and show how they were taking the key messages on board, it would provide another plank of evidence for securing a judgement on the quality of teaching.

Three: Video lessons are a part of practice. The more we do anything, the better we get at it. And that includes observing lessons. We need to be watching other people’s practice, ideally through lesson study type models and through watching videos of others teaching. There are plenty of examples. And by extension, it is a good idea for individuals to watch video footage of their own practice. John Tomsett has written a commentary on his videoed lesson. How would it be if there were a summary of samples of lessons which had been videoed? Again, it would create another solid basis for reaching a judgement.

Four: An ongoing professional interest in what the research is saying. The studies may be large or small scale, but they are considered and discussed, and elements are selected which are appropriate to the school’s context. Some of these might be the work done by the Education Endowment Foundation, Hattie’s Visible Learning and the Teacher Development Trust’s National Teacher Enquiry Network. The concise summary, for instance, of an admittedly small-scale piece of research by Sam Freedman at Teach First on the difference between ‘strong’ and ‘exceptional’ leadership would provide substantial material for discussion on taking school improvements to the next level. If a school is carrying out this sort of practice, it will have an impact on teaching standards, not overnight, but over time. And it is worth capturing this as brief headlines, both for the satisfaction of seeing the journey and as evidence for an inspection.

The above should be triangulated with data on progress for all groups of students. And from this it is possible to reach a judgement about the quality of teaching in the school.

A brief summary, including impact, of the above should be included in the school’s SEF so that it is highlighted for the inspection team. Time on inspections is very tight, so if there is a clear brief summary of practice and headline impact it is more likely to be picked up and acknowledged by the team.

None of this is a quick fix. It is about long-standing practice encouraged and nurtured over time. And not a clipboard in sight. Schools working to these principles are likely to be on the road to good or better.

You can find out more about Lesson Study through TDT’s National Teacher Enquiry Network

4 Comments

drmattoleary on 23 January, 2014 at 5:16 pm

I agree wholeheartedly with the point about the importance of triangulation in making judgements about the effectiveness of teaching and learning, Mary. This echoes one of the key findings to come from a recent project I completed (November 2013) on behalf of the University and College Union (UCU) into the use of lesson observation in the FE sector, which is the largest study of its kind into observation in the UK to date (involving up to 4000 participants). What emerged very clearly from the project’s data was the need for a more complex and sophisticated multi-dimensional model of teacher assessment including observations of practice, learner feedback and evaluations, teacher self-evaluations, peer review, learner completion and attainment rates etc. There was an overwhelming discontent amongst the project’s participants regarding the reliance on annual graded observations as the main or even sole form of evidence in some cases for assessing teacher competence and performance.

It’s interesting that you make reference to the use of lesson study and the importance of engaging with educational research in the context of triangulating the data used to assess overall effectiveness. Whilst these are, of course, important elements in developing a more rounded understanding of how mechanisms like observation can contribute to our understanding of the complex processes of teaching and learning, they are rooted in very different purposes and ideologies than the accountability ethos that underpins an inspectorate like Ofsted.

I have argued quite recently that one of the fundamental obstacles to maximising the use of a valuable tool such as observation is the fact that many in the education sector have become blinkered into viewing its primary/principal use as a form of assessment. Until we are able to cast off these blinkers then its potential as a powerful tool for educational inquiry and continuous teacher learning will remain marginalised. And if we’re going to be honest here, it is the reductive approach to assessment taken by agencies like Ofsted that will only serve to exacerbate this blinkered view of lesson observation.

Following on from this, I was intrigued by the comment you made that

‘It must be made clear that these are not judgements about an individual’s overall practice, but based on what was seen during the observation’

I understand that you’ve got your inspector’s hat on here and are echoing the ‘party line’. Unfortunately, there is an inconsistency between the rhetoric and reality. And what I mean by that is the grade has a longevity attached to it that belies the snapshot period in which it was awarded.
John Pearce (@JohnPearce_JP) on 24 January, 2014 at 3:30 pm

It’s good to see the range of evidence Ofsted considers is widening from “my time” (one of the first to be trained in the early 1990s). The danger is many teachers’ and heads’ experience remains negative and has led to a recent flurry of anti-grade blogs and articles. There is a danger of a false polarity here: To Grade=Bad or To not Grade=Good. I want to be ageist here as I have dedicated my 44 years in this profession to the approach so ably championed by professional developers. But Ofsted and accountability will not go away. We have to marry the coaching, professional development model with the requirement to match criteria – with rigour. We have to find a way of conducting our emotionally intelligent, collaborative inquiry, action research (call it what you will) and ALSO being prepared to challenge bad decisions/judgements by people who look and judge inadequately (pun intended). In other words – All teachers have to speak fluent Ofsted too! Put at its most negative, I used to say “get your revenge in first” to teachers in SpM schools, but that perpetuated a bad metaphor (more about bad metaphors later). More positively… that is why I designed The iAbacus, which STARTS with the teacher’s judgment and, in dialogue, moves on to verify it and analyse and then plan action…. Please don’t reply that this is a product placement – that is another false polarity: Public Service = Good v Business = Bad. Just have a look at BETT2014 Award finalist; see http://www.iabacus.co.uk for the MODEL that drives it and the TOOL that makes it work. We are new, we are young (well one is) and we are free…

PS Can we also stop those silly metaphors when we talk about “Weighing pigs and rooting up plants” not allowing them to develop. In real life of course we grow pigs and plants and are happy to see them flourish, but when they don’t we do weigh and analyse in order to find out why? and What can be done? And when they die or grow stronger we also want to know. Just like schools. The key difference is that, unlike pigs and plants, teachers and students can talk and so we can get into dialogue with them, so let the best triangulation be dialogue and “validated self-evaluation”. This will be my next BLOG when I make time. John
Terry Pearson on 14 February, 2014 at 2:05 pm

Hello Mary. I am pleased you have produced this blog for James. I think you provide some very useful stimulation for thinking about and practising lesson observation. Nevertheless, I can’t help but wonder if you have missed the fundamental point about Rob Coe’s presentation and subsequent discussion at the event mentioned in this blog.

I am pretty sure Rob is not advocating that “it is not possible to judge accurately the quality of teaching from a twenty-minute lesson observation.” He is more concerned about whether it is possible at all to judge accurately the quality of teaching through lesson observation, particularly with one-off observations, irrespective of the length of them. It is the process of observing, the complexity of teaching situations and the use of one-off data collection methods that give rise to the low levels of reliability and validity of judgemental lesson observations. The period of observation produces some, but only minor, effects except in relation to representativeness. Indeed, the observations that were used in the MET study to which Rob frequently refers were only fifteen minutes long.

The conclusion that can be drawn from the research that underpinned Rob’s contribution is that it is really questionable whether the model of lesson observation used by Ofsted can be relied upon to generate trustworthy data to support judgements made during the observation about any aspect of what was observed. Lesson observation may be a useful way of gathering evidence of what the observer noticed but it cannot be used to reach a dependable judgement about what was seen. A well trained observer may be able to offer an enlightening and perhaps fuller, or incisive, description of aspects of the teaching situation but it is unlikely that they can tender accurate and undeniable judgements of the situation. You are so right in respect of this to point out that inspectors “[should] only make a note of evidence if they find it.”

In order to arrive at the best informed judgement of the extent to which what was noticed in a lesson may be influencing the overall quality of teaching and learning, you rightly remind us that evidence from different sources needs to be triangulated with that obtained from the lesson observation. The difficulty with the Ofsted approach to triangulation is that it doesn’t do this very well. It doesn’t set out to triangulate evidence; instead it attempts to triangulate judgements. This is one of the major problems with using evidence recording instruments and a methodology that encourages inspectors to record judgements as well as evidence. For triangulation to work most effectively and robustly, only evidence should be supplied to the decision-making process. If judgements are included and used to inform further judgement then there is a higher probability that the process is more open to error through bias. Moreover, if some of those judgements have been generated through a method of data collection with particularly low reliability and validity, then they are particularly prone to jeopardising subsequent judgements.

For a long time now, as you indicate, Ofsted has endeavoured to promote a stance that “It must be made clear that these are not judgements about an individual’s overall practice, but based on what was seen during the observation.” Yet after twenty years the effect of this stance has been marginal. It is highly likely that this is due to the premise that all decisions and actions taken during lesson observation are psychologically, socially and politically constructed. The grade awarded by an inspector to what was seen during an observation will always follow the teacher. I can appreciate that it is impossible to disentangle the relationship between the two without removing grades from the system.

Numerous studies of, and participants’ reactions to, snapshot, one-off, judgemental lesson observation show that it is not a desirable means of gathering evidence for use in the high stakes arena of Ofsted inspection. It is high time it was removed from Ofsted methodology. I would argue that there is a place for lesson observation in Ofsted’s approach, but not of the type that is currently being used.
Terry Pearson on 17 February, 2014 at 10:34 pm

I am pleased Mary has provided this blog. I think it includes some very useful stimulation for thinking about and practising lesson observation. Nevertheless, I can’t help but wonder if Mary has missed the fundamental point about Rob Coe’s presentation and subsequent discussion at the event mentioned in this blog.

I am pretty sure Rob is not advocating that “it is not possible to judge accurately the quality of teaching from a twenty-minute lesson observation.” He is more concerned about whether it is possible at all to judge accurately the quality of teaching through lesson observation, particularly with one-off observations, irrespective of the length of them. It is the process of observing, the complexity of teaching situations and the use of one-off data collection methods that give rise to the low levels of reliability and validity of judgemental lesson observations. The period of observation produces some, but only minor, effects except in relation to representativeness. Indeed, the observations that were used in the MET study to which Rob frequently refers were only fifteen minutes long.

The conclusion that can be drawn from the research that underpinned Rob’s contribution is that it is really questionable whether the model of lesson observation used by Ofsted can be relied upon to generate trustworthy data to support judgements made during the observation about any aspect of what was observed. Lesson observation may be a useful way of gathering evidence of what the observer noticed but it cannot be used to reach a dependable judgement about what was seen. A well trained observer may be able to offer an enlightening and perhaps fuller, or incisive, description of aspects of the teaching situation but it is unlikely that they can tender accurate and undeniable judgements of the situation. Mary is so right in respect of this to point out that inspectors “[should] only make a note of evidence if they find it.”

In order to arrive at the best informed judgement of the extent to which what was noticed in a lesson may be influencing the overall quality of teaching and learning, Mary rightly reminds us that evidence from different sources needs to be triangulated with that obtained from the lesson observation. The difficulty with the Ofsted approach to triangulation is that it doesn’t do this very well. It doesn’t set out to triangulate evidence; instead it attempts to triangulate judgements. This is one of the major problems with using evidence recording instruments and a methodology that encourages inspectors to record judgements as well as evidence. For triangulation to work most effectively and robustly, only evidence should be supplied to the decision-making process. If judgements are included and used to inform further judgement then there is a higher probability that the process is more open to error through bias. Moreover, if some of those judgements have been generated through a method of data collection with particularly low reliability and validity, then they are particularly prone to jeopardising subsequent judgements.

For a long time now Ofsted has endeavoured to promote the stance that Mary mentions, which is that “It must be made clear that these are not judgements about an individual’s overall practice, but based on what was seen during the observation.” Yet after twenty years the effect of this stance has been marginal. It is highly likely that this is due to the premise that all decisions and actions taken during lesson observation are psychologically, socially and politically constructed. The grade awarded by an inspector to what was seen during an observation will always follow the teacher. I can appreciate that it is impossible to disentangle the relationship between the two without removing grades from the system.

Numerous studies of, and participants’ reactions to, snapshot, one-off, judgemental lesson observation show that it is not a desirable means of gathering evidence for use in the high stakes arena of Ofsted inspection. It is high time it was removed from Ofsted methodology. I would argue that there is a place for lesson observation in Ofsted’s approach, but not of the type that is currently being used.
