Tuesday, 6 October 2015

The Grim Fairy Tale of Teacher Assessment

It occurred to me recently that if statutory teacher assessment were a fairy tale it would be Cinderella. Loved and cherished in the Early Years, its future has become ever more uncertain, its status steadily undermined. Soothed by masters who, publicly at least, pay lip service to its value, teacher assessment is becoming increasingly sidelined, abused, neglected and ignored. One wonders if ultimately it will be ditched altogether. These are dark days indeed for teacher assessment.

Early Years: the warning signs
The foundation stage profile as a statutory entity has a year to go and the government has introduced a baseline assessment for future progress measures, which many schools have already carried out this term. A choice was offered and schools voted overwhelmingly for Early Excellence because it fitted the ethos of the EYFS: teacher assessment based purely on observation. But many schools opted for NFER or CEM (having possibly opted for one of the others before they were removed from the list) so we have a fragmented approach, and I question whether this is the DfE's ideal situation. On the one hand, they want to offer schools a choice (and allow the market to decide the best approach) but deep down would they have preferred a single test that provides a standardised baseline? I'm intrigued to see what happens when the first cohort of children gets to the end of key stage 1. What if analysis shows that there's no apparent relationship between the baseline assessment and key stage 1 results? This would surely mean that the baseline is a poor predictor of outcome, when being a good predictor was a critical stipulation of the consultation (see p7). What then? Could the baselines be ditched and replaced? I guess we'll find out in 3 years' time.

Key Stage 1: The abuse begins 
Remember the performance descriptors? Below national standard, working towards national standard, working at national standard, and mastery. There was a consultation, there was opposition, there was silence for months, and then there were three - working towards, working within, and working at greater depth within the expected standard - contained in a sparse 11-page document. All that time to come up with those? Is it an improvement? OK, there was an apology, but there is also the intriguing use of the word 'interim'. These teacher assessment frameworks are interim and they are for 2016 only. What happens after that? And what happened to 'below'? What of those pupils who do not meet the criteria of 'working towards the expected standard'? How do we assess them?

If that isn't worrying enough, the DfE's response to the consultation on primary school assessment and accountability states that 'at the end of key stage 1, teacher assessment in mathematics and reading will be informed by externally-set, internally-marked tests. There will also be an externally-set test in grammar, punctuation and spelling which will help to inform the teacher assessment of writing.' Informed. Is this a polite way of saying validated? Or straitjacketed? Does this mean that the teacher assessment can only be X if the test score is Y? 

I read Daisy Christodoulou's blog in support of tests recently. It contained the following interesting point:

'Similarly, one way we could ensure greater equity in the early years is to introduce exams at KS1, rather than teacher assessments, since we have some evidence that teacher assessments at this age are biased against pupils from a low-income background – but again, if you suggest replacing teacher assessments with tests, you generally do not get a great response.'

Fair point. And she's right of course, it wouldn't get a great response but one wonders how much support such a position gets behind closed doors. And come to think of it, isn't that what's actually happening anyway? Next year's key stage 1 tests will provide scaled scores linked to an expected standard; and these scaled scores will 'inform' the relevant teacher assessment of which there will be one of just three possible outcomes. It sounds like the tests are winning to me.

Key Stage 2: Wilful neglect

Now things get worse for dear teacher assessment. Reading, maths and science are reduced to simple binary outcomes. Are they working at the expected standard? Yes or no. Writing has been reduced from five possible outcomes (same as the originally proposed key stage 1 performance descriptors with 'above' shoehorned in between 'meeting the expected standard' and 'mastery') down to three as per key stage 1. Just like key stage 1 there is no teacher assessment for GPS, but unlike at key stage 1, the result cannot be used to 'inform' the teacher assessment because the tests are externally marked, which probably explains why for all subjects other than writing, the teacher assessment is binary. The yes or no response. In what way is this useful to anyone?

The STA's timetable of progress measures states that in academic years 2019/20 and 2020/21, progress will be measured from 'new' KS1 teacher assessment to 'new' KS2 test and teacher assessment outcomes. Call me a cynic but I don't believe it. I don't believe anyone would choose to use the vague key stage 1 teacher assessments as a baseline when a scaled score is available. And while we're on the subject, I wouldn't be at all surprised if the GPS test score usurps the writing teacher assessment at the key stage 2 end of the measure. For VA you want data to be as fine as possible, and a 3-tier teacher assessment hardly cuts it. At key stage 2, teacher assessment is looking seriously endangered.

Key Stage 3/4: Locked in the attic

Baselines for the new Progress 8 measure at key stage 4 will involve a decimal level derived from pupils' English and maths results at key stage 2, where English is a combination of reading test result and writing teacher assessment. From 2017 onwards, once the last cohort with overall key stage 2 English levels has left, only reading and mathematics test results will be used in calculating key stage 2 prior attainment fine levels for use in Progress 8. Writing will not feature. For those pupils missing key stage 2 test results, the teacher assessment will only be used in certain circumstances. In most cases where a pupil is missing one result, the teacher assessment will not replace it. Instead, the pupil's baseline will involve the one test result that is present. Here the importance of the key stage 2 teacher assessment has not so much been undermined as completely demolished.

The End

I think this is as far as the Cinderella analogy goes. I do try to be optimistic but I can't see any cause to be so here. From the potential mess of the fragmented reception baseline to the near total exclusion of teacher assessment from Progress 8 baselines, and the 'interim' frameworks in between, the future looks bleak. For all the positive noises about the importance of professional judgement, teacher assessment at all key stages has been progressively marginalised to the point where it is a shadow of its former self, and I'm not sure this particular fairy tale will have a happy ending.


  1. I'll be glad to see it go, however. In spite of all the rationale behind it - and I do accept that there are considerable arguments for using TA - it is still fundamentally flawed if it is used for actual comparable data. It has been a massive burden that we have never been able to properly shoulder, and a source of much stress. Our school, in its attempts not to be seen to inflate results, has, I believe, always underestimated pupils' writing through TA. After years of trying to point this out, I have finally been recognised as having probably been right, but without a proper external yardstick, it was hard to convince people. Although tests have their limitations, at least everyone can get the same ones.

  2. Thanks for this. I wrote the post from a TA-leaning point of view but I do agree with you on the issue of comparable data. I'm a governor of a junior school and have blogged a great deal about work I've done with junior schools around the country, essentially trying to mitigate the issues of inaccurate baselines by introducing on-entry CATS tests, which provide an alternative VA estimate. The results are interesting. I also wonder if the opposition to simple tests for the reception baseline was over the top. Some schools certainly think so but evidently many disagree. I suppose if we end up with less workload and more accurate, reliable and comparable data then that's a good thing. I just hope this doesn't result in the whole process of TA being undermined to the point where schools are administering tests every term. We'll see.

