Wednesday, 14 December 2016

10 things I hate about data

I seem to spend a lot of time ranting these days. Recently I've been trying to rein it in a bit, be less preachy. It's counterproductive to wind people up - I need to get them on side - but the problem is there are just so many opportunities to get annoyed these days. I'm turning into a data analyst who hates data. Well, a data analyst who hates bad data (as any decent data analyst should). And let's face it, there's a lot of bad data out there to get annoyed about. So, a few weeks ago I gave a conference talk entitled '10 things I hate about data' (it could have been much longer, believe me, but 10 is a good number).

Here's a summary of that talk.

1) Primary floor standards
We are now in a crazy world where the attainment floor standard is way above the national 'average'. England fell below its own minimum expectation. How can that happen? On 1st September 2016, the floor standard ceased to be a floor standard and became an aspirational target. But the DfE had already committed to having no more than 6% of schools below floor, which meant they had to set the progress thresholds so low that they captured just a handful of schools. I find it hard to apply the phrase 'sufficient progress' to scores of -5 and -7 and keep a straight face. So primary schools have four floor standards: one linked to attainment, which is way too high, and three relating to progress, which are way too low. If a school is below 65% EXS in reading, writing and maths, and below just one of the progress measures, it is below floor. Unless that one happens to be writing, in which case chances are it'll be overlooked because writing data is junk. Oh, and if you are below just one progress floor then it has to be significantly below to be deemed below floor, which is ridiculous because it's not actually possible to have scores that low and for them not to be significantly below. Meanwhile, secondary schools, with all the complexity of GCSE and equivalent data, have one single measure, Progress 8, which captures the average progress made by pupils across up to 8 subjects. The floor standard at KS4 is half a grade below average. Simple. Why can't primary schools have a similar single, combined-subject, progress-based floor measure?
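The resulting decision rule is convoluted enough to be worth spelling out. Here's a rough sketch in Python of the 2016 floor standard logic as described above. The -5 and -7 thresholds are the 'sufficient progress' figures quoted in the text, and the 'significantly below' flags would come from the significance tests in RAISE, which I'm not reproducing here:

```python
# Sketch of the 2016 KS2 floor standard logic as described above.
# Thresholds are the 'sufficient progress' scores quoted in the text;
# sig_below flags would come from significance tests in RAISE.

THRESHOLDS = {"reading": -5.0, "writing": -7.0, "maths": -5.0}

def below_floor(pct_exs_rwm, progress, sig_below):
    """pct_exs_rwm: % of pupils achieving the expected standard in
    reading, writing and maths combined.
    progress / sig_below: dicts keyed by subject."""
    if pct_exs_rwm >= 65:
        return False  # attainment floor met: above floor regardless
    # Below the attainment threshold: below floor if ANY subject's
    # progress is below its threshold AND significantly below average
    for subject, threshold in THRESHOLDS.items():
        if progress[subject] < threshold and sig_below[subject]:
            return True
    return False
```

And, as noted above, a score below those thresholds is always significantly below in practice, so the significance condition is redundant.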

2) Coasting
I hate this measure. I get what they're trying to do - identify schools with high attainment and low progress - but this has been so badly executed. Why 85%? What does that mean? How does 85% level 4 in previous years link to 85% achieving expected standards this year? Why are they using levels of progress medians for 2014 and 2015 when they could have used VA, which would make the progress broadly comparable with 2016? And why have they just halved the progress floor measures? (smacks of what my Dad would describe as a 'Friday afternoon job'). Remember those quadrant plots in RAISE? The ones that plotted relative attainment (which compared the school's average score against the national average score) against VA? Schools that plot significantly in the bottom right hand quadrant 3 years running - that would be a better definition of coasting. Unless they are junior schools, in which case forget it. Actually, until we have some robust data with accurate baselines, perhaps forget the whole thing.

3) The use of teacher assessment in high stakes accountability measures
The issue of KS2 writing has been discussed plenty already. We know it's inconsistent, we know it's unreliable, we know it's probably junk. Will it improve? No. Not until teacher assessment is removed from the floor standards at least. I'm not saying that writing shouldn't be teacher assessed, and that teacher assessment shouldn't be collected, but we can't be surprised that data becomes corrupted when the stakes are so high. The DfE evidently already understands this - they decided a year ago not to use writing teacher assessment in the progress 8 baseline from 2017 onward (the first cohort to have writing teacher assessed at KS2). It's not just a KS2 issue either. KS1 assessments form the baseline for progress measures so primary schools have a vested interest in erring on the side of caution there; and now that the DfE are using EYFSP outcomes to devise prior attainment groups for KS1, who knows what the impact will be on the quality of that data. All this gaming is undermining the status of teacher assessment. It needs a rethink.

4) The writing progress measure
Oh boy! This is a whopper. If you were doubting my assertion above that writing teacher assessment should be removed from floor standards, this should change your mind. Probably best to read this but I'll attempt to summarise here. Essentially, VA involves comparing a pupil's test score against the national average score for pupils with the same start point. A pupil might score 97 in the test when the national average score for their prior attainment group is 94, so that pupil has a progress score of +3. This is fine in reading and maths (and FFT have calculated VA for SPaG) but it doesn't work for writing because there are no test scores. Instead, pupils are assigned a 'nominal score' according to their teacher assessment - WTS = 91, EXS = 103, GDS = 113 - which is then compared against an unachievable, fine-graded benchmark. So, a pupil in prior attainment group 12 (KS1 APS of 15, i.e. 2b in reading, writing and maths) has to achieve 100.75 in writing, which they can't. If they are assessed as meeting the expected standard (nominal score of 103) their progress score will be +2.25; if they are assessed as working towards (nominal score of 91) their progress score will be -9.75. Huge swings in progress scores are therefore common because most pupils can't get close to their benchmarks due to the limitations of the scoring system. And I haven't got space here to discuss the heinousness of the nominal scoring system applied to pre-key stage pupils, except to say that it is pretty much impossible for pupils below the level of the test to achieve a positive progress score. So much for the claim in the primary accountability document that the progress measures would reflect the progress made by ALL pupils. Hmmm.
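To see how mechanical this is, here's the writing calculation sketched in Python, using the nominal scores and the 100.75 benchmark quoted above:

```python
# Nominal scores assigned to KS2 writing teacher assessments
NOMINAL = {"WTS": 91, "EXS": 103, "GDS": 113}

def writing_progress(teacher_assessment, benchmark):
    """Progress = nominal score for the TA judgement minus the
    fine-graded national benchmark for the prior attainment group."""
    return NOMINAL[teacher_assessment] - benchmark

# Pupil in prior attainment group 12 (KS1 APS of 15), benchmark 100.75:
print(writing_progress("EXS", 100.75))  # 2.25
print(writing_progress("WTS", 100.75))  # -9.75
```

Because only three nominal scores exist, a pupil's progress score can land on only three points, none of which is the benchmark itself - hence the huge swings.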

5) The death of CVA
In 2011, the DfE stated that 'Contextual Value Added (CVA) goes further than simply measuring progress based on prior attainment [i.e. VA] by making adjustments to account for the impact of other factors outside of the school’s control which are known to have had an impact on the progress of individual pupils e.g. levels of deprivation. This means that CVA gives a much fairer statistical measure of the effectiveness of a school and provides a solid basis for comparisons.' Within a year, they'd scrapped it. But some form of CVA is needed now more than ever. Currently, pupils are grouped and compared on the basis of their prior attainment, without any account taken of special needs, length of time in school, number of school moves or deprivation. This is a particular issue for low prior attainment groups, which commonly comprise two distinct types of pupils: SEN and EAL. Currently, no distinction is made and these pupils are therefore treated the same in the progress measures, which means they are compared against the same end of key stage benchmarks. These benchmarks represent national average scores for all pupils in the particular prior attainment group, and are heavily influenced by the high attainment of the EAL pupils in that group, rendering them out of reach for many SEN pupils. Schools with high percentages of SEN are therefore disadvantaged by the current VA measure and are likely to end up with negative progress scores. The opposite is the case for schools with a large proportion of EAL pupils. This could be solved either by introducing some form of CVA or by removing SEN pupils from headline measures. This of course could lead to more gaming of the system in terms of registering pupils as SEN or not registering them as EAL, but the current system is unfair and needs some serious consideration.

6) The progress loophole of despair
This is nuts! Basically, pupils that are assessed as pre-key stage are included in progress measures (they are assigned a nominal score as mentioned above), whereas those assessed as HNM (in reading and maths) that fail to achieve a scaled score (i.e. do not achieve enough raw marks on the test) are excluded from progress measures, which avoids huge negative progress scores. I have seen a number of cases this year of HNM pupils scoring 3 marks on the test and being awarded a scaled score of 80. Typically they end up with progress deficits of -12 or worse (sometimes much worse), which has a huge impact on overall school progress. Removing such pupils often makes the difference between being significantly below and in line with average. And the really mad thing is that if those pupils had achieved one less mark on the test, they wouldn't have achieved a scaled score and therefore would not have been included in the progress measures (unlike the pre-key stage pupils). Recipe for tactical assessment if ever I saw one.
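The inclusion rule, sketched in Python. Note the 3-mark threshold is just the example from my cases above, not an official conversion point:

```python
# Sketch of which pupils enter the KS2 progress measure, per the
# loophole described above. min_marks is illustrative only.

def in_progress_measure(assessment, raw_marks, min_marks=3):
    if assessment == "pre-key stage":
        return True  # assigned a nominal score, always included
    if assessment == "HNM":
        # Included only if raw marks convert to a scaled score
        return raw_marks >= min_marks
    return True  # anyone else with a scaled score is included

print(in_progress_measure("HNM", 3))  # True: included, often -12 or worse
print(in_progress_measure("HNM", 2))  # False: one mark fewer, excluded
```

One raw mark is the difference between dragging the school's progress score down and vanishing from the measure entirely.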

7) The one about getting rid of expected progress measures
The primary accountability document states that 'the [expected progress] measure has been replaced by a value-added measure. There is no ‘target’ for the amount of progress an individual pupil is expected to make.' Yeah, pull the other one. Have you seen those transition matrices in RAISE (for low/middle/high start points) and in the RAISE library (for the 21 prior attainment groups)? How many people would really like to see those broken down into KS1 sublevel start points? Be careful what you wish for. Before we know it, crude expectations will be put in place, which will be at odds with value added, and we're back to square one. Most worrying are the measures at KS1 involving specific early learning goals linked to end of KS1 outcomes, and the plethora of associated weaknesses splashed all over page 1 of the dashboard. Teachers are already referring to pupils not making 'expected progress' from EYFS to KS1 on the basis of this data. Expected progress and VA are also commonly conflated, with estimates viewed as minimum targets. In every training session I've run recently, a headteacher has recounted a visit by some school improvement type who has shown up brandishing a copy of the table from the accountability guidance and told them what scores each pupil is expected to get this year. 'Expected' implies that it is prescribed in advance, and yet VA involves comparison against the current year's averages for each prior attainment group, and we don't know what these are yet. Furthermore, because it is based on the current year's averages, half the pupils nationally will fall below the estimates and half will hit or exceed them. That's just how it is. Expected progress is the opposite of VA and my response to anyone confusing the two is: tell me what the 2017 averages are for each of the 21 prior attainment groups, and I'll see what I can do. I spoofed this subject here, by the way.

8) Progress measures in 2020
Again, this. Put simply, the basis of the current VA measure is a pupil's APS at KS1. How are we going to do this for the current Y3? How do I work out the average for EXS, WTS, EXS? Will the teacher assessments be assigned a nominal value? How many prior attainment groups will we have in 2020 when this cohort reaches the end of KS2? Currently we have 21, but surely we'll have fewer considering there are now fewer possible outcomes at KS1, which means we'll have more pupils crammed into a smaller number of broader groups. Such a lack of refinement doesn't exactly bode well for future progress measures. Remember that all pupils in a particular prior attainment group will have the same estimates at the end of KS2, so all your EXS pupils will be lumped into a group with all other EXS pupils nationally and given the same line to cross. This could have been avoided if the KS1 test scores had been collected and used as part of the baseline, but they weren't, so here we are. 2020 is going to be interesting.

9) Colour coding used in RAISE
Here is a scene from the script I'm working on for my new play, 'RAISE (a tragedy)'.

HT: "blue is significantly below, green is significantly above, right?"
DfE: "No. It's red and green now"
HT: "right, so red is significantly below, and green is significantly above. Got it"
DfE: "well, unless it's prior attainment"
HT: "sorry?"
DfE: "blue is significantly below if we're dealing with prior attainment"
HT: "so blue is significantly below for prior attainment but red is significantly below for other stuff, and green is significantly above regardless. And that's it?"
DfE: "Yes"
HT: "You sure? You don't look sure."
DfE: "Well...."
HT: "well what?"
DfE: "well, it depends on the shade?"
HT: "what shade? what do you mean, shade?"
DfE: "shade of green"
HT: "shade of green?"
DfE: "or shade of red"
HT: "Is there a camera hidden here somewhere?"
DfE: "No. Look, it's perfectly simple really. Dark red means significantly below and in the bottom 10% nationally, light red means significantly below but not in the bottom 10%; dark green is significantly above and in the top 10%, light green is significantly above but not in the top 10%. See?"
HT: "Erm....right so shades of red and green indicating how significant my data is. Got it."
DfE: "Oh no. We never say 'how significant'. That's not appropriate, statistically speaking"
HT: "but, the shades...."
DfE: "well, yes"
HT: *sighs* "OK, shades of red and green that show data is significantly below or above and possibly in the bottom or top 10%. Right, got it"
DfE: "but only for progress"
HT: "Sorry, what?"
DfE: "we only do that for progress"
HT: "but I have dark and light green and red boxes for attainment, too. Look, here on pages 9 and 11 and 12. See?"
DfE: "Yes, but that's different"
HT: "How is it different? HOW?"
DfE: "for a start, it's not a solid box, it's an outline"
HT: "Is this a joke?"
DfE: "No"
HT: "So, what the hell do these mean then?"
DfE: "well those show the size of the gap as a number of pupils"
HT: "are you serious?"
DfE: "Yes. So work out the gap from national average, then work out the percentage value of a pupil by dividing 100 by the number of pupils in that group. Then see how many pupils you can shoehorn into the gap"
HT: "and the colours?"
DfE: "well, if you are 2 or more pupils below that's a dark red box, and one pupil below is a light red box, and 1 pupil above that's a light green box, and you get a dark green box if you are 2 or more pupils above national average"
HT: "and what does that tell us?"
DfE: "I'm not sure, but -2 or lower is well below, and +2 or higher is well above. You may have seen the weaknesses on your dashboard"
HT: "So let me get this straight. We have dark and light shades of red and green to indicate data that is either statistically below or above, and in or not in the top or bottom 10%, or gaps that equate to 1 or 2 or more pupils below or above national average. Am I there now?"
DfE: "Yes, well unless we're talking about prior attainment"
HT: "Oh, **** off!"

Green and red lights flash on and off. Sound of rain. A dog barks.
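For anyone still following, the 'number of pupils' arithmetic from that exchange reduces to a few lines. A sketch in Python; the example figures are made up:

```python
# The RAISE gap-as-pupils calculation from the exchange above.

def gap_in_pupils(school_pct, national_pct, group_size):
    """One pupil is worth 100/group_size percentage points; the gap
    from national average is expressed as a number of pupils."""
    pupil_value = 100 / group_size
    return round((school_pct - national_pct) / pupil_value)

def box_colour(gap):
    if gap <= -2: return "dark red"    # 'well below'
    if gap == -1: return "light red"
    if gap >= 2:  return "dark green"  # 'well above'
    if gap == 1:  return "light green"
    return "no colour"

# A 30-pupil cohort 10 percentage points below national average:
print(gap_in_pupils(43, 53, 30))  # -3
print(box_colour(-3))             # dark red
```

Note that the same percentage-point gap earns a different colour depending on cohort size, which is rather the point of the scene above.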

10) Recreating levels
We've been talking about this for nearly 2 years now and yet I'm still trying to convince people that those steps and bands commonly used in tracking systems - usually emerging, developing, secure - are essentially levels by another name. Instead of describing the pupil's competence in what has been taught so far - in which case a pupil could be 'secure' all year - they instead relate to how much of the year's curriculum has been achieved, and so 'secure' is something that happens after Easter. Despite finishing the previous year as 'secure', pupils start the next year as 'emerging' again (as does everyone else). Pupils that have achieved between, say, 34% and 66% of the year's curriculum objectives are developing, yet a pupil that has achieved 67% or more is secure. Remember those reasons for getting rid of levels? How they were best-fit and told us nothing about what pupils could or couldn't do; how pupils either side of a level boundary could have more in common than those placed within a level; how pupils could be placed within a level despite having serious gaps in their learning. Consider these reasons. Now look at the example above, consider your own approach, and ask yourself: is it really any different? And why have we done this? We've done it so we can have a neat approximation of learning; arbitrary steps we can fix a point score to so we can count progress even if it's at odds with a curriculum 'where depth and breadth of understanding are of equal value to linear progression'. Then we discover that once pupils have caught up, they can only make 'expected progress' because they don't move on to the next year's content. So we shoehorn in an extra band called mastery or exceeding or above, with a nominal bonus point, so we can show better than expected progress for the most able. These approaches have nothing to do with real learning; they've got everything to do with having a progress measure to keep certain visitors happy.
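If you want to see just how level-like these bands are, the whole system reduces to a few lines. The 34%/67% thresholds are the examples above, not any standard:

```python
# 'Steps and bands' tracking reduced to its essentials: a best-fit
# label derived from percentage curriculum coverage, reset each year.

def band(pct_objectives_achieved):
    if pct_objectives_achieved >= 67:
        return "secure"
    if pct_objectives_achieved >= 34:
        return "developing"
    return "emerging"

print(band(66))  # developing
print(band(67))  # secure - one objective apart, a different 'level'
```

Swap the labels for 2c/2b/2a and nothing changes.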
It's all nonsense and we need to stop it.

Merry Christmas!

Friday, 2 December 2016

Example report on primary school performance

This is an anonymised version of a summary report I've just written for a primary school. They kindly allowed me to post it on my blog (with school name removed, obviously). It's not been properly edited yet but hopefully it'll give you some ideas if you are in the middle of writing similar reports at the moment.

Beacon Primary Academy
Summary of performance 2016

Beacon Primary Academy is a larger than average primary school in an area of high deprivation in central Springfield. Most pupils are from ethnic minority backgrounds – the majority are Pakistani or Bangladeshi - and English is an additional language for the majority of pupils in the school. Almost half of pupils are eligible for free school meals, which is considerably higher than national average, and the school ranks amongst the 20% most deprived nationally.

Historically, prior attainment at key stage 1 has been significantly below average but this has improved and prior attainment of current years 4 and 5 is broadly in line with national average.

Floor Standards

Attainment floor (65% achieving the expected standard in reading, writing and maths): not met
Progress floor - reading: met
Progress floor - writing: met
Progress floor - maths: met
Overall floor standards met? Yes

The school fell below the attainment floor measure of 65% achieving the expected standard in reading, writing and maths combined but it is close to the national average of 53%. It should be noted that the majority of schools nationally fell below the 65% threshold. The school is above all 3 progress floor measures (it was significantly above average for progress in writing and maths) and is therefore above floor. Furthermore, the school is not considered to be ‘coasting’ on the basis of its 2016 progress results.

Key stage 2

FFT Analysis of main key stage 2 results
FFT data shows that the average score in reading and maths tests combined in 2016 was 100.5, which is significantly below average, but is 1.6 points higher, and significantly above expected, when pupil start points are taken into account. Furthermore, the result of 49% achieving the expected standard in reading, writing and maths, despite being below the national average of 53%, is significantly higher than expected by 12 percentage points, considering the start points of pupils. This indicates that, under normal circumstances, just 37% of pupils would achieve the expected standard in the three subjects, which translates into 10 more pupils achieving the overall expected standard than would be expected to do so in a school where ‘average’ progress is made. FFT ranks the school in the top 25% for progress, maintaining its 2015 position and an improvement on 2014 when it was ranked at the 33rd percentile for progress.

Overall progress in maths is +2.4, which is significantly above average and ranks the school at the 17th percentile. This contrasts with reading, the progress score for which is 0.0, indicating that progress in that subject was average. This is perhaps to be expected in a cohort comprising such a high proportion of EAL pupils, and should be considered alongside pupils’ progress in grammar, punctuation and spelling (GPS). Here the progress score is +4.1, which is significantly above average, demonstrating that pupils made excellent progress in this subject in comparison to pupils with similar start points nationally. The school is ranked at the 6th percentile (top 6% of schools) for progress in GPS.

FFT analysis shows that all but two pupil groups made more than average progress, with many making significantly more than average progress. The ‘any other’ ethnic minority group of pupils (8 pupils) made average progress, and the SEN support group made less than average progress. However, it should be noted that SEN pupils are often shown to make less than average progress in a VA model where they are compared against pupils nationally with similarly low start points – a group that includes EAL pupils.

RAISE Summary report

Progress at Key Stage 2
Overall progress in writing and maths was significantly above average and was positive for all pupils and disadvantaged pupils in each prior attainment band. Notably, progress in maths was significantly above average for disadvantaged pupils overall, and significantly above average in writing and maths for disadvantaged pupils in the low prior attainment group. As in FFT data, progress in reading was 0, indicating pupils making average progress.
Further analysis of pupil group progress data reveals that no group's progress was significantly below average in any subject, and many groups were significantly above in writing and maths. A notable area for further investigation is the progress of high prior attainers, which was broadly in line with average in maths and below (but not significantly below) average in writing.
Please note: overall low/middle/high is defined by KS1 APS. Subject low/middle/high refers to a pupil’s level in that particular subject at KS1.

Attainment at Key Stage 2
49% of pupils achieved the expected standard in reading, writing and maths at KS2 which is just below the national average. No pupils achieved the higher standard in the three subjects combined. Percentages meeting the expected standard in writing and maths are above national average overall and for each prior attainment band; and disadvantaged pupils’ attainment of expected standards was generally in line with the national averages for non-disadvantaged pupils in these two subjects, and above in the case of low prior attaining disadvantaged pupils. No gaps are therefore identified in writing and maths in terms of percentages of disadvantaged pupils meeting expected standards.
The key issues are twofold: attainment of expected standards in reading and greater depth in writing. Here, large gaps from national average are identified particularly for the middle prior attaining group, and the gap between the middle disadvantaged group and non-disadvantaged pupils nationally in reading is notably wide (-4). Gaps of -2 pupils or lower are classified as ‘well below’ average. There is also a gap identified in terms of high prior attainers achieving the high standard in reading, writing and maths combined. This is due to the lack of pupils achieving greater depth in writing.

Attainment at Key Stage 1
Attainment of expected standards at KS1 was above or well above average in all subjects overall and for each prior attainment group. Disadvantaged pupils’ attainment of expected standards was above average in writing and well above in reading and maths. All but one of the non-emerging disadvantaged pupils (i.e. those disadvantaged pupils that had met the early learning goals) achieved the expected standard at KS1.

FFT data shows that 63% of pupils in this cohort achieved expected standards in reading, writing and maths combined, which is 12 percentage points higher than estimated when pupils’ EYFS outcomes are taken into account. As at KS2, this equates to around 10 more pupils achieving expected standards in the three subjects than perhaps would be expected in a school where average progress is made. FFT data shows that attainment of expected standards at KS1 was above expected for nearly all groups, and significantly above in many cases, most notably Bangladeshi pupils, FSM pupils and lower prior attainers. Only SEN pupils’ attainment fell short of estimated outcomes – the gap equating to one pupil not meeting expected standards.

As at KS2, it is achievement of greater depth (high standard) that is the key issue, particularly for ‘middle’ prior attainment pupils - in this case, those that had met the early learning goals in that specific subject at EYFS (EY expected). FFT shows that the percentage of high prior attainers achieving greater depth in reading, writing and maths combined (25%) was in line with ‘expectations’. However, the percentage of middle prior attainers doing the same was below: no pupils managed the higher standard in all three subjects, which is deemed to be 5% (or 2 pupils) below estimated. 

RAISE shows gaps ranging from 1 pupil below average (EY ‘expected’ disadvantaged pupils achieving greater depth in writing at KS1) to 6 pupils below (EY ‘expected’ pupils (all pupils) achieving greater depth in reading at KS1). A further investigation into middle prior attaining pupils achieving high standards is recommended. It is, for example, likely that many pupils placed in the EY ‘expected’ prior attainment group, based on EYFS outcome in that particular subject area, did not achieve GLD.

Percentages of EY ‘exceeding’ pupils (the high prior attainment group) achieving greater depth at KS1 are also low but numbers of pupils are small and gaps do not generally equate to 1 pupil, except in the case of writing, where the school figure for all pupils is flagged as -2 (i.e. 2 pupils below average). It should be noted that there are no pupils in the EY ‘exceeding’ group for maths and only 4 pupils are in this group for reading, in contrast with writing where 10 pupils are identified as exceeding based on EY outcomes. This is unusual as writing is usually the lower of the 3 EYFS areas.

Phonics continues to improve overall and for all key groups. Notably, the percentage of disadvantaged pupils achieving the expected standard in phonics is above that of non-disadvantaged pupils nationally, and has been for the past 3 years. 95% of disadvantaged pupils achieved the expected standard in Y1 in 2016.

Overall progress at KS2 is high particularly in writing and maths where scores are above and significantly above average for certain groups. Progress in reading is in line with national average but this is perhaps to be expected for a cohort comprising mostly EAL pupils. Progress in writing for high prior attainers is below average but there are serious concerns over the accuracy and reliability of teacher assessment in writing nationally. Three more pupils assessed at greater depth instead of expected standard in writing would have turned the negative progress score into a positive.

Overall attainment at KS2 is broadly in line with national averages. RAISE highlights relatively low attainment for middle prior attainment group in reading and the middle and high prior attainment groups in writing. Attainment of lower prior attainers tends to be in line with or above average. This is reflected in the progress measures (see above).

Attainment at KS1 is similar to KS2 with relatively low percentages of middle prior attaining pupils achieving higher standards. However, it is likely that strong improvements in phonics alongside the further embedding of the new curriculum will see this situation change in future years.

The key issue arising from data is the relatively low progress made by middle prior attaining pupils and this should be a key focus in future.