Anyway, no doubt you've all now downloaded your data from the tables checking website (and if you haven't, please do so now. Guidance is here) and have spent the last week trying to make sense of it, getting your head round what -1.8 means and how those confidence intervals work. Perhaps you've used my latest VA calculator to recalculate data with certain pupils removed, to update results in light of review outcomes, or maybe to change results to those 'what if' outcomes.

This is all good fun (or not depending on your data) and a useful exercise, especially if you are expecting a visit, but it's important to understand that the DfE has made changes to the methodology this year - some of which I predicted and some of which I didn't - and, of course, the better we understand how VA works, the better we can fight our corner.

So what's changed?

**Actually let's start with what hasn't changed:**

**1) National average is still 0**

VA is a relative measure. It involves comparing a pupil's attainment score to the national average score for all pupils with the same start point (i.e. the average KS2 score for the prior attainment group (PAG)). The difference between the actual and the estimated score is the pupil's VA score. Adding up all the differences and dividing by the number of pupils included in the progress measure gives us the school's VA score. If you calculate the national average difference the result will be 0. Always.

School VA scores can be interpreted as follows:

- Negative: progress is below average
- Positive: progress is above average
- Zero: progress is average

Note that a positive score does not necessarily mean all pupils made above average progress, and a negative score does not indicate that all pupils made below average progress. It's worth investigating the impact that individual pupils have on overall progress scores and taking them out if necessary (I don't mean in a mafia way, obviously).
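To make the arithmetic concrete, here is a minimal Python sketch of the calculation described above: each pupil's VA is their actual score minus the estimate for their PAG, and the school's VA is the mean of those differences. The pupils, scores and estimates are all made up for illustration; real estimates come from the DfE's published tables.

```python
from statistics import mean

def school_va(pupils):
    """Mean of (actual - estimate) across all pupils in the measure."""
    return mean([p["actual"] - p["estimate"] for p in pupils])

# Hypothetical cohort: scores and estimates are invented for illustration.
pupils = [
    {"name": "A", "actual": 104, "estimate": 102.0},
    {"name": "B", "actual": 99, "estimate": 101.0},
    {"name": "C", "actual": 110, "estimate": 105.0},
    {"name": "D", "actual": 79, "estimate": 94.0},  # one large negative
]

overall = school_va(pupils)

# 'What if' analysis: recalculate with pupil D removed.
without_d = school_va([p for p in pupils if p["name"] != "D"])

print(overall, round(without_d, 2))
```

In this invented cohort a single pupil is enough to turn a positive school score into a negative one, which is why it's worth checking individual pupils before drawing conclusions.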

**2) The latest year's data is used to generate estimates**

Pupils are compared against the average score for pupils with the same start point in the same year. This is why estimates based on the previous year's methodology should be treated with caution and used for guidance only. So, the latest VA calculator is fine for analysing 2017 data, but is not going to provide you with bombproof estimates for 2018. Same goes for FFT.

**3) KS1 prior attainment still involves double weighting maths**

KS1 APS is used to define prior attainment groups (PAGs) for the KS2 progress measure. It used to be a straight up mean average, but since 2016 has involved double weighting maths, and is calculated as follows:

(R+W+M+M)/4

If that fills you with rage and despair, try this:

(((R+W)/2)+M)/2

Bands are as follows:

Low PA: KS1 APS <12

Mid PA: KS1 APS 12-17.99

High PA: KS1 APS 18+
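If it helps, here's a short Python sketch of the double-weighted KS1 APS and the banding above. The function names are my own, and the example point scores (15, 13, 17) are just illustrative inputs. It also checks that the two versions of the formula give the same answer.

```python
def ks1_aps(reading, writing, maths):
    """KS1 APS with maths double weighted: (R+W+M+M)/4."""
    return (reading + writing + maths + maths) / 4

def ks1_aps_alt(reading, writing, maths):
    """Equivalent form: average R and W, then average that with M."""
    return (((reading + writing) / 2) + maths) / 2

def prior_attainment_band(aps):
    """Low: <12, Mid: 12-17.99, High: 18+."""
    if aps < 12:
        return "Low"
    if aps < 18:
        return "Mid"
    return "High"

# Illustrative pupil: reading 15, writing 13, maths 17.
aps = ks1_aps(15, 13, 17)
print(aps, ks1_aps_alt(15, 13, 17), prior_attainment_band(aps))
```

A straight mean of the same three scores would be 15; double weighting maths pulls this pupil up to 15.5 because maths was their strongest subject.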

**4) Writing nominal scores stay the same**

The crazy world of writing progress continues. I thought the nominal scores for writing assessments might change but that's not the case, i.e.

WTS: 91

EXS: 103

GDS: 113

**This means that we'll continue to see wild swings in progress scores as pupils lurch 10 points in either direction depending on the assessment they get, and any pupil with a KS1 APS of 16.5 or higher has to get GDS to get a positive score, but GDS assessments are kept in a remote castle under armed guard. I love this measure.**
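A quick Python sketch shows why the swings are so wild. The nominal scores are those listed above; the estimate of 104.0 is a hypothetical figure chosen for illustration, not a published one.

```python
# Nominal scores assigned to KS2 writing teacher assessments.
WRITING_NOMINAL = {"WTS": 91, "EXS": 103, "GDS": 113}

def writing_va(assessment, estimate):
    """Pupil's writing progress score: nominal score minus estimate."""
    return WRITING_NOMINAL[assessment] - estimate

# Hypothetical estimate for a pupil's prior attainment group.
estimate = 104.0
outcomes = {code: writing_va(code, estimate) for code in WRITING_NOMINAL}
print(outcomes)
```

With only three possible scores, a pupil whose estimate sits just above 103 cannot get anywhere near zero: it's either a small negative for EXS, a large positive for GDS, or a very large negative for WTS.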

**5) As do pre-key stage nominal scores**

No change here either, which means the problems continue. Scores assigned to pre-key stage pupils in reading, writing and maths are as follows:

PKF: 73

PKE: 76

PKG: 79

Despite reforms (see changes below) these generally result in negative scores (definitely if the pupil was P8 or above at KS1). It's little wonder so many schools are hedging their bets and entering pre-key stage pupils for tests in the hope they score the minimum of 80.

**6) Confidence intervals still define those red and green boxes**

These can go on both the changed and not changed piles. Confidence intervals change each year due to annual changes in standard deviations and numbers of pupils in the cohort, but the way in which they are used to define statistical significance doesn't. Schools have confidence intervals constructed around their progress scores, which involves an upper and a lower limit. These indicate statistical significance as follows:

Both upper and lower limit are positive (e.g. 0.7 to 3.9): progress is significantly above average

Both upper and lower limit are negative (e.g. -4.6 to -1.1): progress is significantly below average

Confidence interval straddles 0 (e.g. -1.6 to 2.2): progress is in line with average
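The three rules above boil down to a very small function. This is just a sketch in Python, with a function name of my own choosing, using the example intervals from the list.

```python
def significance(lower, upper):
    """Classify a progress score from its confidence interval limits."""
    if lower > 0 and upper > 0:
        return "significantly above average"
    if lower < 0 and upper < 0:
        return "significantly below average"
    # Interval includes 0, so we can't distinguish it from average.
    return "in line with average"

for lower, upper in [(0.7, 3.9), (-4.6, -1.1), (-1.6, 2.2)]:
    print(lower, upper, significance(lower, upper))
```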

**7) Floor standards don't move**

This shocked me. If I had to pick one data thing that I thought was certain to change it would be the floor standard thresholds. But no, they remain as follows:

Reading: -5

Writing: -7

Maths: -5

Schools are below floor if fewer than 65% of pupils achieve the expected standard in reading, writing and maths combined, and the school falls below any one of the above progress thresholds (caveat: if it is below the threshold in just one measure, then that score needs to be significantly below average, i.e. sig-. Hint: it will be). Oh, and floor standards only apply to cohorts of 11 or more pupils.
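Here's a rough Python sketch of the floor standard logic as I've described it above. It encodes my reading of the caveat (one failing subject must also be significantly below average, i.e. the upper confidence limit is below 0), so treat it as an illustration rather than the DfE's official rule. The function name and data layout are hypothetical.

```python
FLOOR_THRESHOLDS = {"reading": -5, "writing": -7, "maths": -5}
MIN_COHORT = 11
ATTAINMENT_THRESHOLD = 65  # % achieving expected standard in RWM combined

def below_floor(cohort_size, pct_expected_rwm, progress):
    """progress maps subject -> (progress score, upper confidence limit)."""
    if cohort_size < MIN_COHORT:
        return False  # floor standards don't apply to small cohorts
    if pct_expected_rwm >= ATTAINMENT_THRESHOLD:
        return False  # attainment element met
    failing = [s for s, t in FLOOR_THRESHOLDS.items() if progress[s][0] < t]
    if not failing:
        return False  # all progress thresholds met
    if len(failing) == 1:
        # Assumed reading of the caveat: must also be sig- (upper limit < 0).
        return progress[failing[0]][1] < 0
    return True

# Hypothetical school: low attainment, reading progress below threshold.
example = {"reading": (-5.5, -2.1), "writing": (-1.0, 2.0), "maths": (0.5, 3.4)}
print(below_floor(30, 50, example))
```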

**And now for what has changed**

**1) Estimates - most go up but some go down**

The estimates - those benchmarks representing average attainment for each PAG against which each pupil's KS2 score is compared - change every year. This year most have gone up (as expected) but some, for lower PAGs, have gone down. This is due to the inclusion of data from special schools, which was introduced to mitigate the issue of whopping negative scores for pre-key stage pupils.

Click here to view how the estimates have changed for each comparable PAG. Note that due to new, lower PAGs introduced for 2017, not all are comparable with 2016.

**2) Four new KS1 PAGs**

The lowest PAG in 2016 (PAG1) spanned the KS1 APS range from 0 to <2.5, which includes pupils that were P1 up to P6 at KS1. Introducing data from special schools in 2017 has enabled this to be split into 4 new PAGs, which better differentiates these pupils. The use of special school data has also had the effect of lowering progress estimates for low prior attainment pupils, which goes some way to mitigating the issue described here. However, despite these reforms, if the pupil has a KS1 APS of 2.75 or above (P8 upwards) a pre-key stage assessment at KS2 **is** *going to result in a negative score.*

**3) New nominal scores for lowest attaining pupils at KS2**

In 2016, all pupils that were below the standards of the pre-key stage at KS2 were assigned a blanket score of 70. This has changed this year, with a new series of nominal scores assigned to individual p-scales at KS2, i.e.

P1-3: 59 points

P4: 61 points

P5: 63 points

P6: 65 points

P7: 67 points

P8: 69 points

BLW but no p-scale: 71 points

I'm not sure how much this helps mainstream primary schools. If you have a pupil that was assessed in p-scales they would have been better off under the 2016 scoring regime (they would have received 70 points); as it stands they can get a maximum of 69. Great.

**Please note: these nominal scores are used for progress measures only. They are not included in average scaled scores.**
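To see the effect on individual pupils, this small Python sketch compares the 2017 nominal scores listed above with the 2016 blanket score of 70. The dictionary keys are my own shorthand for the assessment codes.

```python
# 2017 nominal scores for pupils below the pre-key stage standards.
NOMINAL_2017 = {
    "P1-3": 59, "P4": 61, "P5": 63, "P6": 65,
    "P7": 67, "P8": 69, "BLW": 71,  # BLW = below, but no p-scale
}
BLANKET_2016 = 70  # every such pupil received this in 2016

def change_from_2016(code):
    """Points gained (+) or lost (-) compared with the 2016 blanket score."""
    return NOMINAL_2017[code] - BLANKET_2016

changes = {code: change_from_2016(code) for code in NOMINAL_2017}
print(changes)
```

Every p-scale code comes out negative; only the 'BLW but no p-scale' pupils gain a point. Whether a pupil's progress score actually falls also depends on how their estimate has moved, which this sketch ignores.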

**4) Closing the progress loophole of despair**

Remember this? In 2016, if a pupil was entered for KS2 tests and did not achieve enough marks to gain a scaled score, then they were excluded from progress measures, which was a bonus (unless they also had a PKS assessment, in which case they ended up with a nominal score that put a huge dent in the school's progress score). This year the DfE have closed this particular issue by assigning these pupils a nominal score of 79, which puts them on a par with PKG pupils (no surprise there). In the VA calculator, such pupils should be coded as N.

The loophole is still open by the way. Pupils with missing results, or who were absent from tests, are not included in progress measures, and I find that rather worrying.

**5) Standard deviations change**

These show how much, on average, pupils' scores deviate from the national average score; and they are used to construct the confidence intervals, which dictate statistical significance. This is another reason why we can't accurately predict progress in advance.
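As a rough illustration of how the standard deviation and cohort size feed into the interval, here is a Python sketch that assumes the usual 95% construction: school score plus or minus 1.96 x national SD / square root of the number of pupils. I haven't spelled out the formula in this post, so treat it as an assumption, and note that the SD of 6.0 is a made-up figure.

```python
from math import sqrt

def confidence_interval(school_score, national_sd, n_pupils, z=1.96):
    """Assumed 95% interval: score +/- z * national SD / sqrt(n)."""
    half_width = z * national_sd / sqrt(n_pupils)
    return school_score - half_width, school_score + half_width

# Same progress score, hypothetical SD of 6.0, two different cohort sizes.
small = confidence_interval(-1.8, 6.0, 36)
large = confidence_interval(-1.8, 6.0, 144)
print(small, large)
```

Under these assumptions the same score of -1.8 straddles 0 for a cohort of 36 but is entirely negative for a cohort of 144, so both the SD and the number of pupils matter for significance.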

-----

So, there you go: quite a lot of change to get your head round. It has to be said that unless the DfE recalculate 2016 progress scores using this updated methodology (which they won't), I really can't see how last year's data can be compared to this year's.

But it will be, obviously.

Fantastic post - thank you.

Just to clarify though - why do the estimates have to change each year?

Could the estimates not be kept the same, but the scaled score conversion be adjusted to account for the change in average raw score achieved for that PA group.

So the raw score may fluctuate each year, needed to achieve the scaled score required, but the scaled score estimate could remain static - thus allowing the 2b PA group of 15 APS+ to always be expected to achieve 100 scaled score?

I sort of get where you're coming from. However, doing that goes against the concept of VA: that pupils are compared against the average score for their prior attainment group in that year.

The scaled score conversion already does fluctuate each year to reflect changes in difficulty of the tests. The raw score required to achieve expected standard is set using results of sample and live tests. Once the expected standard has been set, they can then establish the scale. If you say in advance that all 2b pupils (bear in mind we only have 2 more years of them) need to get 100 to make average progress, then the underlying raw score will change considerably. It doesn't really change anything: what children have to get to make average progress still changes every year. Either it's a raw score, or the scaled score.

Also I think what you're proposing changes the meaning and value of the expected standard. At the moment it represents an academic standard we expect all pupils to meet; if we went with this idea, it would become the standard we expect 2b pupils to meet.

It would also probably result in the return of expected progress measures, progress matrices, and points-based progress measures.

15 to 100 is 85 points over the key stage.

That's approx 20 points per year.

I worry about stuff like that.

It perhaps makes more sense if you think about KS4. Perhaps the average GCSE result in maths for a pupil that was 4.5 at KS2 is a grade 4 this year. This becomes the VA benchmark for all 4.5 pupils. Next year that may increase to a grade 5, so the benchmark (expectation) increases to a grade 5. If we implemented the system you proposed for KS2, we would always expect a grade 4 for 4.5 pupils, but that the value of a grade 4 changes over time. This couldn't work because grades are supposed to be comparable (stable). Scaled scores are also supposed to be comparable year on year.

Thanks - it's all very fascinating.

I now have so many questions.

I find myself wondering how the system would change if the expected progress were determined by a static attainment outcome, i.e. 15 APS converts to 100 scaled score, as opposed to the current system where expected progress is determined by how well the national cohort does.

Theoretically if the national cohort do badly, then expected progress in that year would drop? A PA 2b child could fail to achieve ARE at ks2, yet still have a positive VA, if they exceed national average for the PA group? Hypothetical, I realise, yet an interesting thought.

Thank you for your insight - always appreciated.

But that's how it works now: if 2b children on average score 98, then 98 is the estimate, so if your 2b child scores 99 then they make positive progress.

The phrase 'expected progress' concerns me. The removal of levels has resulted in the end of the expected progress measure. Even Ofsted have stated that inspectors must not use the phrase. By fixing estimated outcomes to a specific scaled score from each start point, we would effectively reinstate the measure.

It's a complex issue. Probably best discussing in person or by phone. Email me and we can arrange a phone call.

"But that's how it works now: if 2b children on average score 98, then 98 is the estimate, so if your 2b child scores 99 then they make positive progress."

Yes, I agree with this - I think I'm understanding the current process now, however, I am wondering whether this is appropriate? A child who was age related at ks1, converting to below age related at ks2, yet still getting a +ve VA?

I shall email you and book a conversation - thank you.

Maybe even a chance for you to come up to Bolton and address our head teacher cluster?

Yes, in theory it's possible: if the national average score for 2b pupils was 98 then any 2b pupil achieving above that will have a positive VA score. However it is highly unlikely. Last year the national estimates for pupils with KS1 APS of 15 were around 100. This year they are slightly higher, reflecting the increasing KS2 attainment of that prior attainment group. It will no doubt only rise as years go on.

The other thing to consider is whether 2b really represents the national expectation at KS1. 1) KS1 assessments are highly subjective and therefore not especially reliable (one school's 2b is another school's 2c or 2a); and 2) 2b ceased to be national average many years ago. The last time sublevels were used, the KS1 APS was around 16.8 so much closer to a 2a.

Let's arrange a call.

Interesting point to make about floor targets at Reading: -5, Writing: -7, Maths: -5 is that the bottom 5% is Reading: -3.9 & below, Writing: -4 & below, Maths: -4.3 & below.

Does that mean no school is below floor?

Julian Wood (@ideas_factory)

No. It just means fewer than 5% are below floor (it was approx 5% last year). Surprised that they kept them the same but this way you can show improvement due to fewer schools below floor.

Genius.

Thank you James for your help and clarity.

Is the measure of 17+ still deemed as above average?

What is 'above above average'?

Thanks

You mean at KS1? Prior attainment bands as follows:

Low: <12 APS

Middle: 12-17.99 APS

High: 18+ APS

Hope that helps

Hi James. I'm currently working out our low, middle and high attainers from year 3 and 4. Can I just clarify that if a child's results were: WTS, WTS and ARE, they would still fall into the 'middle' bracket.

Thanks

There's no guidance for this so your guess is as good as mine. FFT have come up with a logical method. Alternatively devise your own. I wouldn't be surprised if DfE assign KS1 assessments a nominal score in order to calculate APS. Guess we'll have to wait 2 years to find out.

Ok, thanks for your quick reply.

Hi James

Our overall combined progress measure for 2017 was +2.6 (sig +) as opposed to +4.8 (sig +) in 2016.

The FFT aspire dashboard denotes a downward arrow for progress on the 2017 dashboard front page.

Given that nationally each PA group made on average about 1 to 1.5 scaled points more in 2017 than they did in 2016, would it be correct to say that if my pupils' attainment remained in line with that of last year, and from their starting points they reached the same end points as they had in 2016, then the school VA would in fact be approx 1 to 1.5 points lower than last year?

Given that VA is a measure of the difference between scaled score achieved at the school, and that found nationally on average, then in my case the gap has narrowed not necessarily because my children made significantly less progress, but because nationally the children made more progress - hence the gap has narrowed.

I am finding myself having to argue that our progress is not significantly worse than last year, ( although it is slightly worse) but the gap has narrowed and thus VA fallen because the nation has done better on average, not that our children have done worse.

I refer back to a previous blog of yours in which you clearly argue that 2016 data is not easily compared to 2017 data, yet a down arrow on FFT dashboard implies one is being judged against the other.

If the 2016 VA were to be recalculated on the basis of the 2017 calibration scale, then I am aware that my 2016 VA would drop from the dizzy heights of +4.6, but maybe the difference between 2016 and 2017 would then be not quite such a chasm, that I find myself having to justify.

I'm sure it is not often that schools ask for their VA to be reduced, but it seems if data from one year to another is to be compared then the same calibration scale needs to be used.

Would be interested to know your thoughts and I keep coming back to trying to find a way of calculating VA, using a stable measure of expected progress, that does not vary year to year according to how well or how badly the national cohort does.

Thanks

In a word: yes! If your pupils from the same start points achieved the same as last year, your VA would definitely be lower unless your cohort comprised very low prior attainers, whose estimates actually dropped this year due to the inclusion of special school data. Most schools would however see a drop of 1-1.5 as you say, because other pupils nationally have done quite a bit better this year. It’s a zero sum game unfortunately. Heard a few HTs half jokingly say ‘I wish we hadn’t done so well last year’.
