Wednesday, 25 October 2017

MATs: monitoring standards and comparing schools

A primary school I work with has been on the same journey through assessment land as many other schools up and down the country. Around two years ago they began to have doubts about the tracking system they were using - it was complex and inflexible, and the data it generated had little or no impact on learning. After much deliberation, they ditched it and bought in a simpler, customisable tool that could be set up and adapted to suit their needs. A year later they have an effective system that teachers value, that provides all staff with useful information, and that is set up to reflect their curriculum. A step forward.

Then they joined a MAT.

The organisation they are now part of is leaning on them heavily to scrap what they are doing and adopt a new system that will put them back at square one. It's one of those best-fit systems in which all pupils are 'emerging' (or 'beginning') in autumn, mastery is a thing that magically happens after Easter, and everyone is 'expected' to make one point per term. In other words, it's a return to levels, with all their inherent flaws, risks and illusions. The school is resisting the change in a bid to keep its system, but the MAT sends data requests in its desired format, and it is only a matter of time before the school gives in.

It is, of course, important to point out that not all MATs are taking such a remote, top-down, accountability-driven approach; but some are still stuck in a world of (pseudo-)levels and are labouring under the illusion that you can use teacher assessment to monitor standards and compare schools, which is why I recently tweeted the following:


This resulted in a lengthy discussion about the reliability of various tests and the intentions driving data collection in MATs. Many stated that assessment should only be used to identify areas of need in schools, in order to direct support to the pupils that need it; data should not be used to rank and punish. I completely agree, and this should be a strength of the MAT system: they can share and target resources. But whatever the reasons for collecting data - and let's hope that it's done for positive rather than punitive reasons - let's face it: MATs are going to monitor and compare schools, and usually this involves data. Which brings me back to the tweet: if you want to compare schools, don't use teacher assessment; use standardised tests. Yes, there may be concerns about the validity of some tests on the market - and it is vital that schools thoroughly investigate the various products on offer and choose the one that is most robust, best aligned with their curriculum, and most likely to provide useful information - but surely a standardised test affords greater comparability than teacher assessment.

I am not saying that teacher assessment is always unreliable; I am saying that it can be seriously distorted when used for multiple purposes (as stated in the final report of the Commission on Assessment without Levels). We need only look at the issues with writing at key stage 2, and the use of key stage 1 assessments in the baseline for progress measures, to understand how warped things can get. And the distorting effect of high-stakes accountability on teacher assessment is not restricted to statutory assessment; it is clearly an issue in schools' tracking systems when that data is used not only for formative purposes but also to report to governors, LAs, Ofsted, RSCs, and senior managers in MATs. Teacher assessment is even used to set and monitor teachers' performance management targets, which is not only worrying but utterly bizarre.

Essentially, using teacher assessment to monitor standards is counterproductive. It is likely to result in unreliable data, which then hides the very things these procedures were put in place to reveal. And even if no one is deliberately massaging the numbers, there is still the issue of subjectivity: one teacher's 'secure' is another teacher's 'greater depth'. We could have two schools with very different in-year data: school A has 53% of pupils working 'at expected' whereas school B has 73%. Is this because school B has higher-attaining pupils than school A? Or is it because school A has a far more rigorous definition of 'expected'?

MATs - and other organisations - have a choice: either use standardised assessment to compare schools or don't compare schools. In short, if you really want to compare things, make sure the things you're comparing are comparable.
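To illustrate that final point with a toy calculation (all numbers entirely made up): because standardised scores from the same test sit on a common scale, two cohorts' average scores can be compared directly, and a rough standard error shows how much of any gap might simply be noise. A minimal sketch in Python:

```python
from statistics import mean, stdev
from math import sqrt

# Hypothetical standardised scores for two cohorts sitting the SAME test.
# Standardised scales conventionally have a mean of 100 and SD of 15, so
# these numbers are directly comparable between schools.
school_a = [94, 101, 88, 110, 97, 103, 92, 99, 106, 95]
school_b = [102, 108, 96, 113, 100, 107, 98, 104, 111, 101]

def cohort_gap(a, b):
    """Difference in cohort means, with a rough standard error so a small
    gap is not over-interpreted as a real difference between schools."""
    diff = mean(b) - mean(a)
    se = sqrt(stdev(a) ** 2 / len(a) + stdev(b) ** 2 / len(b))
    return diff, se

diff, se = cohort_gap(school_a, school_b)
print(f"Gap between cohorts: {diff:.1f} points (roughly ±{1.96 * se:.1f})")
```

No such calculation is possible with percentages of pupils 'at expected', because 'expected' has no common definition across schools.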


13 comments:

  1. Everything you write here makes a great deal of sense, but as a member of the assessment working group at our MAT, I'm finding a common issue with standardised tests which is causing some concern for those teachers who are trying to engage with scores instead of teacher assessment.
    Standardised scores for individual pupils can vary significantly between tests, and we must be very careful not to jump to conclusions about one child's progress based on a higher or lower score than they achieved previously. I've spoken to our school improvement partner, who has tracked scores in the past, and she says this is a typical feature of standardised testing. Daisy Christodoulou also warns against making decisions about individuals based on a single test score in Making Good Progress.
    If we acknowledge this weakness of standardised testing and continue to point out that a cohort's average score should be used to compare it against other cohorts or against its own historical performance, then standardised testing has a much better chance of becoming the norm and replacing teacher assessment as the best way to make summative judgements.

    1. This comment has been removed by the author.

    2. Yes, fair point - I think I allude to this in the blog. I've had similar conversations recently with Becky Allen, but I get the impression that some tests are more reliable/less noisy than others. We need to find the more reliable ones and ditch the noisy ones. The problem is that some don't test in enough detail. There is also the problem of termly tests not necessarily covering what a school has taught (probably more of an issue in maths). I think the best approach may be a single test at the end of the year, or more regular computer-adaptive tests that concentrate on specific topics. But anyway, yes, I completely take your point. Thanks

    3. Also, in terms of MAT level reporting, they will most likely be looking at changes in average scores over time, and comparing average scores of one group against another to monitor gaps. I think that what you suggest would/should naturally be done by MATs at the top level.

    4. Yes - you're right that only groups are of interest to MATs, but teachers need to trust the system. We can develop this trust by continually pointing out that inferences made from group data are more valid than those made about individuals, and that we shouldn't worry about fluctuations in pupil-level results.
      I'd be really interested to see the content/outcome of your communication with Becky Allen - was it on a public platform I could look at?

    5. Although this is true of all assessments, it is possible to obtain quite high reliability in standardised assessments. The old SATs had very high reliability between tests. Primary schools used to use them as practice and interim assessments and we had quite a few past papers to work with. There was a time when we were in the habit of giving one reading paper a week (I know!) and there was remarkable consistency between them. It takes time and trialling to achieve this, though.

    6. No - the communication with Becky Allen is not in the public domain, although the outcome of research into the reliability of various tests is due to be published this term, I think.

    7. This comment has been removed by the author.

    8. And we can build trust with teachers by separating performance management from assessment. The purpose of assessment should be to support learning. That’s it. If that is the culture in a school then teachers will have more faith in it.

    9. The main point is though that surely using standardised assessment to evaluate schools’ performance is a better option than basing it on how many boxes teachers have ticked off in a tracking system.

    10. That's a very good point about PM.

      I'll keep an eye out for the reliability research.

  2. This comment has been removed by the author.

    1. This comment has been removed by the author.
