In defence of criticisms of UCU’s graded lesson observation report

Former Ofsted FE and skills inspector Phil Hatton was critical of a report from University of Wolverhampton academic Dr Matt O’Leary that raised “serious questions about the fitness for purpose” of graded lesson observations. This is Dr O’Leary’s response to Mr Hatton.

The University and College Union (UCU) research project into the use and impact of lesson observation in FE recently came in for some critique from Phil Hatton.

The project is the largest and most extensive of its kind carried out in the English education system and as such marks an important milestone in lesson observation research.

However, Mr Hatton seemed more intent on damning the report than seriously engaging with its key findings. This is disappointing but not surprising given that Mr Hatton seemed to have a particular axe to grind and, as it turns out, has not even read the report.

Mr Hatton describes himself as a ‘scientist’, yet there is a noticeable lack of empirical evidence or systematic argument in his article, much of which is based on personal anecdotes.

The fact that he dismisses the real experiences and views of thousands of UCU FE members displays a high level of contempt towards them. That he should also compare them to ‘turkeys’ voting for Christmas is an insult to the very serious issues raised in this research.

Whether or not he disagrees with the views of UCU members, to belittle them is disrespectful and irresponsible. It is clear Mr Hatton has not read the report in full and thus draws on his pre-established prejudices to support his argument, the antithesis of a ‘scientific’ approach.

Mr Hatton takes issue with the representation of college managers in the study. The research sample included UCU members nationally. The fact that senior college managers comprised a small percentage of that sample is a reflection of the composition of UCU’s membership.

It has nothing to do with excluding a specific population group from the research, which Mr Hatton seems to imply in his comments.

A sample can only be drawn from the population in question. If Mr Hatton were to make the effort to read the report in full, he would indeed find there are numerous instances in which the views and voices of senior managers are included, often conflicting with those of teaching staff.

His comments suggest that he has little understanding of research methodology. If he did, then he would know that to reduce threats to the validity and reliability of any research, the methodology should be made explicit and transparent for all to see so that a judgement can be made on what data was collected, from whom, how it was collected and analysed etc.

Once again, had he read the report, he would realise that there is a section which discusses this in detail and is open to the external scrutiny of any reader.

Mr Hatton states: ‘I am very simplistic about my expectations of the FE system’. His simplistic position is not restricted to his expectations of FE, but extends to his conceptualisation of the way in which observation is used as a method and its role in informing judgements about professional competence.

In referring to a system of observation that he introduced at a college where he was responsible for managing quality, he conflates grading performance with ‘identifying and spreading good practice’, as though this were unproblematic and uncontested, to say nothing of the disputed notion of what constitutes ‘good practice’.

However, in his defence, he does state that this was 18 years ago.

Times have certainly changed considerably since then and the failure to acknowledge the increasingly high stakes nature of graded observations in FE is merely one example of how out of touch he appears to be with the current debate.

His claim that ‘if you cannot put on a performance with notice, there has to be something very lacking in your ability’ is very revealing about Mr Hatton’s views of the purpose of observation.

He is right about associating the use of summative observations with ‘performance’. A key theme to emerge from the research data was the inauthenticity of the performance element of isolated, episodic observations.

There were repeated examples of ‘poor’ teachers raising their game for these one-off observations, only to go back to their poor practice for the rest of the year.

In contrast, some consistently effective teachers were so unnerved by these high stakes observations that they seemed to go to pieces during the observed lesson.

Thus the important lesson here is that performance-driven observations are an extremely unreliable means of attempting to assess and measure professional competence.

His final claim that ‘the best way of gauging the quality of the experience of learners is to observe what they are getting in a quantitative way, in a transparent way’ would seem a commendable suggestion, but it is one that belies the complexities of teaching and learning and seeks to measure them in a reductive and ultimately unreliable manner.

Let us continue to use observation to inform judgements about the quality of teaching and learning, along with other sources of evidence.

But let us also acknowledge its limitations and accept that the grading of performance is little more than a pseudo-scientific practice that gives rise to some very serious counterproductive consequences for the well-being of staff.

Dr Matt O’Leary, principal lecturer and research fellow in post-compulsory education at the University of Wolverhampton’s Centre for Research and Development in Lifelong Education (CRADLE), and author of Classroom Observation: A Guide to the Effective Observation of Teaching and Learning

Your thoughts

  1. Matt, firstly, I did read your report (all 104 pages of it on a train from York to London) so please do not make an assumption and write your opinion that I did not read it as if it were a fact. While defending your report against anyone who would dare to question it, you have failed to pick out some of the key points of my piece, such as the impact of the change in grade 3 descriptor on the process of graded observations, and that in reality, there are now only two grades that matter (something I would have included in such a ‘milestone’ research piece – as it has had such a negative impact on observation). My reason for being critical of the research methodology is that the main target sample was just one of several teaching unions, who as a body are opposed to graded observations, and of whom just over 11% replied to your survey. That is a very low return for a survey, and one I would not consider statistically valid to extrapolate to reflect the views of approximately 32,000 members, and certainly not the FE sector as a whole. That is not me dismissing their views as you wrongly said (you seem to like expressing views for others), but my expressing of an opinion that the sample was not sufficiently reflective of the sector. In analysing the group that did reply, you asked how long they had been in teaching, but not what they last received as a ‘rating’ for themselves when being subject to a graded observation. Now that just might have been a variable with a major impact on your interpretation of the data to arrive at your findings?

    Some of the statistics quoted were baffling and would have been forbidden by the statisticians who have checked research I have written. In question 11, where feedback was sought about the ‘usefulness of feedback to improve teaching by identifying areas for improvement’ (but curiously not strengths to build upon), a high number did not respond and very high numbers expressed a view on both graded and ungraded observation, suggesting either that large numbers undertook both in their colleges (unlikely) or that the question was insufficiently clear. Also, were the 427 who skipped that question those with outstanding practice who might be asked to help spread it (because that was an area not really asked about in the survey)? The term ‘quality assurance’ (which has a distinct negative connotation and became defunct some time ago) was also used in some questions rather than ‘quality improvement’.

    I do know a little about the sector, and your report has not sufficiently captured the good practice of many colleges in working with staff to focus on improving individual teachers in a supportive way to benefit their learners. There is some great practice out there that is not being shared. Try reading what I said without being overly defensive about your report methodology and you might even agree with (some of) the points in it:

    ‘Giving feedback to someone who is doing a reasonable job of teaching, but saying they require improvement rather than being satisfactory is a world apart, both during inspection and as part of internal quality improvement’

    ‘Getting the way an observation system is viewed in a college right requires consultative management, not focusing on labelling people as a particular grade of teacher that somehow then defines them, but on a shared purpose of getting the overall package of course delivery to “good”.’

  2. I consider OFSTED-style inspections to be idiotic and all engaged in implementing them to be self-serving tools of management and government. I don’t teach now, thank God, but when being observed by such minions I used to deliberately aim for a Grade 3, just to avoid any unpleasant repercussions such as having to be reobserved. I got this down to something of a fine art! I know I was an excellent and popular teacher, but I refused to put on a special show and jump through hoops for so-called ‘inspectors’, who I usually suspected were probably not as good teachers as me on a day-to-humdrum-day. The only opinions about my teaching that I ever had any respect for were those of my students [FE/adult] and from them I always got a 99% approval rating. Well, nobody’s perfect, not even me, and you can’t please everybody all the time! But I never wanted to please so-called ‘inspectors’ any of the time …
    P.S. Yes, I’m a fully-paid-up member of both the Awkward Squad and the Lecturers’ Revolutionary Anti-Fascist Brigade.

    • Angela

      I was observed by a non teacher head of department who apparently had undertaken CPD in tick boxing, she had not taught possibly ever in her life but had got a job apparently after doing community or adult teaching … rumour – another person had a NVQ in childcare … and observing someone in a very highly specialist field with post graduate quals. Needless to say a friend of the head got a 1 – even though she spelt words incorrectly on her coloured sticky post it notes and I got a bad grade which she could not justify nor explain when I challenged it. She refused to put her grade in writing also. I then was ordered to go along to some teacher learning ‘expert’ who taught sports but was not involved in professional sports herself, who again I was more qualified and experienced to teach not be taught by and she suggested I do little warm up/starter intro games for 16+ learners such as bean bag throwing … what?

  3. Angela

    Before you pre ofsted and internal Ofsted others – you should also teach, and not just teach the well behaved, wanting to learn students but the ones who are there to avoid starvation and non receipt of benefits. All management should teach at least a third of the timetable as part of their CPD. And not at the end of the year – at the beginning of the year please.

  4. Angela

    Can someone comment on this – if a principal goes into a college and trashes the college to a low grade – do they

    a) get big pay rises – say a mere £45,000 on top of a £120,000 wage if the college ups a grade on next inspection

    b) get big pay rises every time the college ups a grade on every subsequent inspection

    c) get a pay cut – for whatever reason.

    if it pays to get a low grade and subsequently improve … does anyone investigate this kind of caper – and if certain managers/principals/heads seem to attract low grades everywhere they go – which suddenly improve… does this attract investigation and if not, why not?

  5. Matt O'Leary

    Dear Phil, it’s good to see you’ve made the effort to engage with the report more substantively. There are certainly some points we agree on: your last comments (i.e. the impact of Ofsted’s shifting of the goal posts from satisfactory to RI, the impact of labelling individual teachers and the importance of any observation scheme being negotiated and created in consultation with staff) are all a case in point and actually support some of the report’s findings.

    My work is always open to external scrutiny and I welcome informed debate. I’m never precious about anything I write but I do challenge unsubstantiated assertions. You mention that you have carried out research. As someone who has been researching & writing about observation in the FE sector for the last decade, I am familiar with all the research about lesson observation but have never come across any research by you on the topic. Of course, it’s possible that I may have missed it so now would be a good opportunity to share it in a public forum. I’ve argued for some time that it’s an underresearched area so additional studies would be welcomed.

    I agree that there are now only 2 grades that ‘matter’ under the Ofsted 4-point scale – all the more reason for moving towards a pass/fail or ‘fitness to practice’ type judgement rather than the crude, unreliable numerical scale that currently dominates.

    In response to some of your specific comments about the UCU research: firstly, with regard to the target sample, it is important to stress that the research was commissioned by UCU FOR UCU members. The report’s findings are not presented as a reflection of the views and experiences of the whole FE sector. How different media sources choose to interpret and report the findings is not for me to decide. If anyone wants to commission a study that seeks to incorporate other unions and professional bodies then I’m happy to discuss that.

    Secondly, with regard to the response rate, I openly acknowledge that it is low. As anyone involved in quantitative research knows, the issue of what is deemed ‘statistically significant’ remains contested. Whether or not you choose to rely on a ‘textbook’ definition of statistical significance, just under 4000 participants is not an insignificant number and their responses need to be considered carefully. I find it ironic that you deny dismissing participants’ views but choose to refer to them as ‘turkeys voting for Christmas’. The use of such disparaging discourse is illustrative of ‘confirmation bias’, i.e. you don’t agree with their views and thus immediately dismiss them.

    Thirdly, your suggestion about using the grades participants received as part of their last graded observation as a variable for analysis is bizarre. Given the question marks about the reliability of grading in observations as a whole (see here and here for further discussion), the idea that such a variable could be deemed valid is a non-starter. Sports science colleagues of mine who only conduct quantitative research comment that the use of such a subjective variable would be nonsensical.

    Fourthly, the notion that anyone can make a judgement about the 427 participants who skipped question 11 is completely unfounded and enters the realm of conspiracy theory. A researcher can only analyse the recorded responses, and any assumption about the who/why of non-response is exactly that, an assumption.

    I completely agree with you that there is some excellent practice going on out there in the FE sector. As someone who works with staff in colleges across the whole country on a daily basis, I’m consistently buoyed by the excellent work that exists. Unfortunately, looking at the qualitative data in the report, the counterproductive consequences of current approaches to observation far outweighed its use and benefits as a formative tool. That’s what the data revealed and is reinforced by the recent OECD TALIS report (2014). Believe you me, there is no one more than me who would like to see that balance reversed as I am convinced that observation has a vital role to play in enhancing our understanding of the processes of teaching and learning.

  6. Eunice

    I left FE as I was disgusted by the observation system amongst other issues. I was observed by staff who were not qualified in teaching or my subject area. An absolute joke.

  7. K Heaney

    My last observation was a really bad experience. I had just returned to work after the death of my husband. I was observed by 2 senior managers. The one who wrote the report failed to acknowledge that the lesson was a group tutorial session, and therefore ‘marked’ me down for not drawing out teaching points. She also lied, saying that I hadn’t got a scheme of work. Consequently I got a poor grade. When challenged she altered the tick box from taught session to tutorial but everything else had to remain the same. Upon challenging this I found this automatically put me into a disciplinary process. Needless to say, this made me ill and I am now on long term sick leave.

    How did this process help my students, or me?