Wednesday, 17 December 2014

The Emperor's new tracking system

Those familiar with 'comment is free' on The Guardian website will know that nothing divides opinion and generates quite as much vitriol as an article about the latest smartphone. Not even the recent debate on Scottish Independence could compete with the iPhone 6 in terms of acrimony. It's a tribal thing, and I've noticed something similar in schools. You can go into a school and tell them their results are poor and it's a fair cop. Criticise their tracking system on the other hand and people get seriously defensive. But criticise them I will, and more so this year when many systems have paid lip service to assessment without levels by doing a bit of window dressing.

Over the course of this term I've spent much of my time discussing what various tracking systems - both commercial and those developed in-house - are doing with regard to assessment without levels, and a certain track by The Who gets lodged in my brain on a loop.

"Meet the new boss, same as the old boss."

In many schools the tracking system rules, to the point where I've heard senior leaders say "we like this method of assessment but it doesn't fit with our system", which is depressing and utterly the wrong way round. The tracking system is supposed to be a servant, not a master; a tool to inform teaching and learning, for self-evaluation. It should be tailored to fit the new curriculum, not the other way round, but here we are attempting to shoehorn the new curriculum into systems that are deeply entrenched in the old 3 points per year methodology. These 3 points may now be called steps, and may be rebadged as 'emerging', 'developing', 'secure' (these seem the most common terms) but let's not kid ourselves: they're just levels by another name with some even attempting a conversion back to old money. A case of the Emperor's new tracking system.

I think many of us assumed that, despite the new curriculum and the death of levels, progress would continue to be measured in much the same way, with extension into the next year's curriculum being the order of the day. So, pupils could be categorised as working above (or way above) age-related expectations and progress shown in terms of years/months, or in points, much as we have done previously, with 3 points/steps being the expectation. An 'exceeding' child would be one working above their curriculum year, and good progress would be 4 or more steps a year.

Well, that's what we thought. But then the penny dropped: it wasn't about extension, it was about depth of understanding. All that mastery business.

So we have systems that were built to show rapid progress towards a goal of as many pupils as possible being above age-related expectations, now trying to measure achievement in a curriculum that expects all (or nearly all) pupils to cover broadly the same content at approximately the same rate; it's just their depth of understanding that will differ. As a headteacher said to me recently: "coverage is irrelevant". I'm still not sure how true that is but it's a cool soundbite and would look neat on a t-shirt.

And so, as this big, weighty penny hits the ground with a loud clang, the advice has changed. The original answer to the question about how we show progress - i.e. "just classify them as a year above" - has changed to "don't classify them as a year above". Pupils working in the next year's curriculum become the exception rather than the rule. I note that this shift in thinking has resulted in the quiet dropping of the term 'exceeding' from the tracking and assessment lexicon as people realise that 'exceeding' is a difficult thing to define when pupils are no longer progressing into the next year's curriculum and beyond, but are instead drilling deeper.

What this means for many schools, as they carry out their autumn assessments and enter them into their tracking systems, is that pretty much all pupils are being categorised as 'emerging' in terms of that year's objectives. Next term they'll be 'developing' and by the summer they'll all be 'secure'. Hurrah! But a tracking system that doesn't adequately differentiate between pupils is fairly pointless really; and what's missing from all this is the depth of understanding. The terms 'emerging', 'developing' and 'secure' are generally being used to denote coverage of curriculum content, each linked to a third of objectives achieved (or a sixth if across 2 years). They do not indicate the degree to which the pupil has mastered the subject. That's a different matter entirely and one that is only just beginning to be addressed by tracking systems, most of which are still locked into a concept of progress based on rapid extension.
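For what it's worth, the coverage-based banding described above boils down to something this blunt (a minimal sketch; the thresholds follow the common 'thirds of objectives' convention mentioned above, not any particular system's actual rules):

```python
def coverage_band(objectives_achieved, total_objectives):
    """Band a pupil purely on coverage of the year's objectives.

    Thresholds follow the common 'thirds' convention described above
    (use sixths of the combined objectives if working across 2 years).
    Note what it doesn't capture: depth of understanding.
    """
    fraction = objectives_achieved / total_objectives
    if fraction < 1 / 3:
        return "emerging"
    if fraction < 2 / 3:
        return "developing"
    return "secure"
```

Which is precisely the problem: two pupils with identical coverage get the same band regardless of how deeply either has mastered the content.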

Ironically, it is the lower ability pupils that these systems serve well as they race to catch up with age-related expectations, and are therefore able to make rapid rates of progress in the traditional sense of the word. Pupils that are where you expect them to be at the start of the year will probably all make expected progress; and best not have any higher attainers: they'll most likely go backwards this year, and make expected progress after that.

Clearly there needs to be a radical rethink of our approach to the tracking of assessment data where depth of learning is central to our concept of progress rather than some add-on feature. But there are still lots of questions to answer and debates to have over the course of this year. Can we confidently define a level of mastery at any point in the year? Can we use an average depth of understanding to compare groups of pupils or subjects? Can we track progress through changes in the depth of understanding? Is that any more or less useful than APS? Can an SEN pupil working at a much lower 'level' still show mastery in their work? I hope so. However, until we let go of the comfort blanket of pseudo-levels we're not going to solve these issues and come up with a fit-for-purpose system that works in harmony with the new curriculum rather than attempting to straitjacket it.

So, forget the old boss and do something different.

We won't get fooled again.

Wednesday, 10 December 2014

Calculating VA at KS2 beyond levels

Understandably there is a great deal of confusion about how the DfE can calculate VA once the current assessments are replaced in 2016 with new tests that give scaled scores. But we know that a VA measure will exist - it will be centred on 0 rather than 100 - and will be one of 4 key performance measures at key stage 2. It is important to understand that VA does not require data in compatible formats at either end of the measure; you just need a measure at the start and one at the finish.

VA is a measure of progress that compares a pupil's attainment against a national average for pupils with the same prior attainment. It has nothing to do with levels of progress. Prior attainment could be in sublevels, scaled scores, fruit, woodland mammals, or shades of grey (maybe not the latter). Here's an example: a series of 10k races is held across the country, with runners wearing one of three vests: red for slower-paced runners, green for medium-paced runners, and blue for faster runners. You could then compare the race times of a runner wearing a green vest against that of the average green-vested runner nationally. If they beat the time then they have a positive score; if they meet it, their score is 0; if they fall short, their score is negative. That's sort of how VA works.

It's worth noting that FFT already measure progress from EYFS to KS1 using the same methodology: they compare a pupil's key stage 1 result against the national average outcome for pupils with the same prior attainment at EYFS. No conversion of EYFS data to APS takes place; it's not necessary. In fact it's a bit of a fallacy that KS1 and KS2 data was ever in the same format: one was a teacher assessment whilst the other was a fine grade converted from a test score.

So, what happens after levels? Well, same thing really: we compare a pupil's key stage 2 scaled scores against the national average scaled score for pupils with the same prior attainment at key stage 1. I've put together the following diagram to explain this:

In this example, pupil A is +6 and pupil B is -4; the average VA for this (admittedly small) sample is therefore +1 ((6 + -4)/2). More importantly, it should be noted that Pupil A has exceeded the VA estimate, and has a positive VA score, despite not having met the expected standard. Conversely, it is possible for a pupil to meet or exceed the expected standard but still end up with a negative VA score if their scaled score is lower than the national average score of pupils with the same start point. 
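To make the arithmetic concrete, here's a minimal sketch of that calculation. The prior attainment bands and national average scores are invented for illustration; only the method (pupil's scaled score minus the national average for pupils with the same start point) reflects the measure described above.

```python
# Assumed national average KS2 scaled scores by KS1 prior attainment
# band - invented figures, for illustration only.
national_avg = {"lower": 93, "middle": 102}

def va_score(prior_band, ks2_scaled_score):
    """VA = pupil's KS2 scaled score minus the national average KS2
    score of pupils with the same KS1 prior attainment."""
    return ks2_scaled_score - national_avg[prior_band]

# Pupil A scores 99: below an assumed expected standard of 100,
# yet VA is +6 because their start point was low.
pupil_a = va_score("lower", 99)      # +6
# Pupil B scores 98: VA is -4 despite a similar raw score.
pupil_b = va_score("middle", 98)     # -4
cohort_va = (pupil_a + pupil_b) / 2  # +1.0
```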

Hope that helps to clear up some of the confusion surrounding this issue.

Wednesday, 5 November 2014

Mind the gap! The devilish detail in the RAISE closing the gap report

On the face of it, it looks like an improvement. The closing the gap report in last year's RAISE seemed like an afterthought and once you'd been through the main report you knew it all anyway. I never really saw the point in it. But this year it's changed: 3 year trends for expected and better than expected progress, VA, APS, and L4/5+ reading, writing and maths, all in one neat 2 page report. What's not to like? I actually thought it was a useful addition, until someone asked me about the red.

"Erm, red?"

It turns out the two RAISE reports I'd briefly looked at didn't have any red so I'd not really had cause to read the blurb. Well, I've been busy. That's my excuse.

So I looked at some other reports. 


I don't want to go into too much detail about the method as it's already been covered in depth here:

And if you want to read the DfE guide, it's here:

I just want to share one table from a primary school's RAISE report and have a bit of rant. Check this out:

Here it is in all its glory, with plenty of red. Just to clarify - in case you've not read the blurb yet - red indicates where the figure for disadvantaged pupils is lower than that of non-disadvantaged pupils nationally, and the difference is greater than the % value of 1 pupil in the group. There's no significance test, no confidence intervals, no standard deviations. Just a big, red flashing light. And only disadvantaged pupils get a red light by the way, and only in 2014. Obviously. And no one gets a gold star, which kind of sucks.

So, have a look at the table again, and focus particularly on the top section (maths) and on the disadvantaged pupils that were level 3 on entry. There were 4 of them. And ONLY half of them - that's just 2 whole pupils - made 2 levels of progress. I'm shocked. And what's worse, nationally 92% of non-disadvantaged pupils that were level 3 on entry made expected progress. The difference is -42%. And 42% is greater than the % value of one pupil in that cohort (i.e. 25%). Klaxon sounds, red light flashes, periscope down, "DIVE, DIVE, DIVE".
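The 'rule' being applied here is simple enough to write down (a sketch of the method as the blurb describes it, not the DfE's actual code):

```python
def gap_red_flag(group_pct, national_pct, group_size):
    """Flag 'red' when the disadvantaged group's figure falls below
    the national non-disadvantaged figure by more than the percentage
    value of one pupil in the group. No significance test, no
    confidence intervals - just this."""
    one_pupil_pct = 100 / group_size
    gap = national_pct - group_pct
    return gap > one_pupil_pct

# The maths example: 50% of 4 pupils vs 92% nationally (gap 42 > 25)
# The reading example: 0% of 2 pupils vs 64% nationally (gap 64 > 50)
```

Run it on either example above and the red light duly flashes, however small the group.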

4 pupils in the group

Compared against a cohort of around 100,000

So, that stands up.

(I'm guessing about the 100,000, but accuracy is low down the priority list anyway)

Now, have a look at the reading section. Focus on the disadvantaged pupils that were level 1 on entry. There were.......wait for it.......2. And none of them made 3 levels of progress. None. 0%. This just isn't good enough. And to prove it we are told that, nationally, 64% of non-disadvantaged pupils, with the same start point, made 3 levels of progress.

0% of 2 vs 64% of 64,789 (I totally made that figure up) = -64%

This data could only be improved if Chris Morris was presenting it.

Or maybe David Mitchell.

So, that's my rant. I hope you've enjoyed it as much as I haven't.

Oh, and in the process of finding my Underground photo above, I discovered that 'mind the gap' is a whole other internet thing. Best not search at work.

Over and out.

Wednesday, 29 October 2014

Without feathers? My LA swan song.

So, that's that then. Today was my last day working at Gloucestershire LA after nearly 5 years of wading neck deep in data; supporting schools, battling Ofsted, and trying not to make a nuisance of myself but generally failing. It's been quite an experience being on the frontline of school improvement during this most challenging and bewildering period of change in education; and (I must admit) I've enjoyed it. Most of the time. So when I started to think about what I wanted to do next, I realised I didn't really want to do anything different; I wanted more of the same. I like doing what I do so I should carry on doing it. Taking the plunge and becoming self-employed seemed the obvious (if somewhat scary) choice, and with more change in the post this year (and the next, and the year after that), now is probably the best time to do it. Going freelance will enable me to support a wider variety of schools, get involved in collaborative projects, and work with 3rd party organisations. I've got a laptop, a mobile, and a logo; what could possibly go wrong?

I joined Gloucestershire LA's school improvement team in January 2010, after 4 years as a landscape gardener/stay at home Dad. Prior to that I was a data analyst for the LSC (remember them?), and before that I tried to be a teacher but really wasn't good enough (in case anyone isn't aware, teaching is actually quite hard work). In other previous lives I've been a database administrator (fun!) and completed a PhD in granites (that was a low point). I also worked on a chicken farm when I was 16. I'll return to that later.

When I took the job in 2010, most of Gloucestershire's school services were based at the Hucclecote Centre. Those familiar with education in Gloucestershire will probably know it well. Famed for its training courses - and probably even more famed for its lunches - it was a vibrant place to work. But from May 2010, with the coalition government in power signalling the end of National Strategies and the start of austerity, the writing was on the wall for the centre and many who worked there. Hucclecote (or Chucklecote, or Chuckles) was one of the last of its kind, and its days were numbered. The last gasp was an archaeological dig for some foundations in the car park. And that was it - we relocated to Shire Hall. No more Chuckles. No more lunches.

In a very short space of time we'd gone from a large community of 200 people based in an old secondary school, to a team of fewer than 20 in one room (well, I was in a yellow-painted office for a bit, but it gives me headaches to think about it). Now, I don't want to bore you with a blow by blow account of what I've been doing for the last few years but I do want to give you some simple stats (obviously!). In September 2012, shortly after we'd moved to Shire Hall, and been pared down to the bare bones, around 70% of schools in Gloucestershire were rated good or outstanding, and 60 primary schools (out of 240) were satisfactory. Our team's focus became these schools, whilst trying to spot those good or outstanding schools that were at risk of travelling south (bear in mind that nearly all secondary schools are converter or sponsored academies so the team were mainly primary focussed). Wind forward 2 years and now 90% of primary schools are rated good or outstanding; and 92% of primary pupils attend a good or outstanding school, the highest in the south west. An extraordinary rate of improvement due, in no small part, to the intervention, challenge and support of Gloucestershire's excellent school improvement team. So when I hear snide remarks about the LA - classics include "Is anyone left?", "what do they actually do?", or "must be like the Marie Celeste in there" - I get a tad annoyed. It was heartening to hear Mr Drew (of Educating Essex) at Cheltenham Literature Festival this month support his LA, stating that he felt no more 'free' now as an academy than he did before conversion, and that he'd only ever had great support from his LA.

Say what you like about LAs, my (very experienced) colleagues work bloody hard for one common purpose: to support schools. That's it. And their approach to school improvement evidently works. Personally, I can't see why a high performing LA with a proven track record in school improvement, of which Gloucestershire is certainly one, couldn't be an academy sponsor, both in and out of county. They'd do a fine job. Maybe a better job than some other organisations. Food for thought. 

So, what now for LAs? Well, regardless of what happens in terms of the proliferation of academies, I believe there needs to be a strong, experienced, well trained, locally-based school improvement service and the LA is the obvious place for it. Many recent inspections of LA school improvement services, where the LA has been deemed to be ineffective, have had a recurring theme: that the LA has not done enough to support and challenge the academies in its area. This is a bit of a kick in the eye considering the whole premise of becoming an academy is that the school is removed from LA control. It would seem that the DfE wishes to take credit for the successes but pass the buck when things go wrong. Surely that can't be the case, can it?

Which brings me neatly back to the chicken farm I worked on when I was 16. One day, down in shed 1, I noticed a tiny, featherless chicken running amongst the thousands of other birds. Every chicken it ran past would peck at it mercilessly, so on it ran, darting left and right, back and forth, on a futile mission to avoid attack. Every step was met with another blow. A depressing and pathetic sight. I mentioned it to the foreman later on that day and asked if we should put the poor thing out of its misery. His response? "If we kill it, they'll just start picking on another one. Best keep it alive as long as possible".

I find myself thinking about that poor chicken a lot these days, running without feathers under continuous attack. Is that why LAs now exist? Kept alive solely to sustain attacks and deflect attention. Have they been reduced to mere whipping boys for government policy? A cynical view perhaps but understandable when you read another inspection report of ineffective school improvement services. And whilst some in government may see things that way, the reality is that LAs are still here, they do have a responsibility for school improvement; and it's a job they take very seriously and do really well. My colleagues are ace!

Rumours of local authorities' demise have been greatly exaggerated. 

Wednesday, 8 October 2014

The Good of Small Things

I recently sent out a tweet asking for thoughts and experiences relating to small schools, particularly regarding data. Now, I could have written a lengthy, and probably rather dull, blog on the subject of data and small schools but instead I thought I'd just publish the following email, which perfectly and succinctly summarises the problems facing these settings when it comes to accountability measures and inspection. Regrettably, the author wishes to remain anonymous but you know who you are. Thank you so much for the following contribution:


Hi James,

I saw your tweet about wanting to hear about the experiences of small schools on specific data issues and thought I’d rather email than tweet.

I write from the perspective of being a governor and volunteer data handler for a small school (96 pupils).


Whilst small pupil numbers are a real benefit in terms of the school being able to concentrate on looking at each pupil as an individual, noting each child’s attainment and progress and the next steps required, I think small schools have particular problems when it comes to handling cohort and group data. No doubt you’ll be familiar with these – though perhaps no. 4 is only just becoming an issue with more focus on this in inspections.

1.       Small pupil numbers mean that we don’t have sufficient sample sizes to be confident that our collective data is telling us much at all. Our results have to be very good (or very bad!) to be classed as statistically significant on RAISEonline. Even worse if trying to analyse groups.

2.       It’s very hard to spot trends because we need results over several years to collect sufficient sample sizes. By the time we can be confident of a trend through data analysis, things will have moved on anyway.

3.       I have no confidence that the people who judge us by our data will necessarily be competent statisticians. In my experience some people start off their analysis with a general rider about small cohort numbers requiring caution and then throw caution to the wind and proceed to read far more into the data than it can possibly support.

4.       I particularly worry about proving to Ofsted that “From each different starting point, the proportions of pupils ... exceeding expected progress in English and in mathematics are high compared with national figures” and hope that any inspector who visits us understands that if we only have 1 pupil at a particular starting point, say 2b in reading, and that child fails to reach a Level 5, our score of 0% making more than expected progress from that starting point is actually in line with a national percentage of 29%. I do hope inspectors are being briefed to read the data properly and would love to access an authoritatively written statement on this that properly explains the situation.

5.       Small schools (often in underfunded ‘leafy’ areas with little or no pupil premium) often do not have the budget to be able to afford or justify what seem like expensive data handling packages – especially given the question of how much use they will really be when sample sizes are so small. Even FFT is hard to justify now the LA no longer subscribe.

6.       In small schools where the head is also a class teacher and probably the premises & catering manager and there is no deputy head and perhaps just a single school secretary/administrator, he/she will have to balance data management along with all their other tasks. Where do they get time in the day to really master this subject let alone source useful software and keep abreast of new government initiatives?

7.       Headteachers find it hard to discuss the progress of disadvantaged pupils with governors if there are so few pupils receiving the pupil premium that discussion will lead to identification.


Hope the above is of some help and thanks for all your work shared over Twitter,


As stated above, I think this email perfectly encapsulates the issues experienced by small schools when it comes to data, particularly that presented in RAISE, the Ofsted Dashboard and Performance Tables. There is some provision for this in the Ofsted Handbook, which headteachers of small schools should be aware of:

'Where numbers of pupils are small and achievement fluctuates considerably from year to year, inspectors should take into account individual circumstances when comparing with national figures, and should consider any available data of aggregate performance for consecutive cohorts' (Ofsted Inspection Handbook, p.20, section 60).

This gives small schools the go-ahead to merge cohort data and produce aggregated attainment and progress figures for, say, the last three Y6 cohorts, which will result in a more viable data set and more meaningful analyses. On a number of occasions I've calculated 3 year aggregated VA (using my VA calculator - see other blog), which has proved to be very useful indeed and made a positive contribution to the overall outcome of inspections.
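The combining step is just a pupil-weighted average across the cohorts (a sketch with invented figures; the per-pupil VA scores would come from a calculation like the one in the previous post):

```python
def aggregated_va(cohorts):
    """Pupil-weighted mean VA across several cohorts.

    Each cohort is (number_of_pupils, mean_va). Weighting by cohort
    size is equivalent to pooling every pupil's individual VA score
    and taking a single mean.
    """
    total_pupils = sum(n for n, _ in cohorts)
    return sum(n * va for n, va in cohorts) / total_pupils

# Three hypothetical Y6 cohorts from a small school
three_year_va = aggregated_va([(8, 1.5), (6, -0.5), (10, 0.3)])
```

A three-year figure like this smooths out the year-to-year lurches that make single small cohorts so hard to interpret.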

Also, small schools should access their FFT data, which provides 3 year average VA and CVA, and trend indicators. In addition, the new FFT Aspire system will allow schools to toggle between single year and 3 year averages. Potentially a very useful feature, especially when dealing with small cohorts or groups. If you are wondering whether or not inspectors will be interested in or accepting of FFT data, take note of the following:

'Evidence gathered by inspectors during the course of the inspection should include: any analysis of robust progress data presented by the school, including information provided by external organisations' (Ofsted Inspection Handbook, p.66, section 195)

I'm sure the issues outlined above resonate with staff in many if not all small schools up and down the country; and there are clearly specific challenges facing those schools. But there are ways round them and I hope that this has given you some ideas. Please get in touch if you have any questions or comments.

Tuesday, 23 September 2014

Improving school improvement: how LAs use FFT to support and challenge schools

Despite the reduction in scale and scope of local authority education services over the past few years, their school improvement remit remains. LAs have a responsibility to support and challenge schools, to intervene where necessary and to help drive up standards; the aim being for all pupils to have access to a good standard of education. 

School improvement teams rely on a number of sources of data to gain insight into school performance; and use that data to identify schools whose performance gives cause for concern. Such schools may be below or close to DfE floor standards, have key groups that are underperforming, or show a downward trend over the past three or more years. Data sources that feed into the school improvement process include those in the public domain such as the DfE school performance tables, Ofsted reports, Ofsted data dashboard, and statistical first releases, which all provide vital information. Then there are NCER systems such as Keypas and EPAS, which provide a huge amount of attainment and progress data from national level down to individual pupil test scores. These systems release data early in the autumn term, thus giving LAs an early look at standards and an often vital head start. Beyond that, LAs make good use of RAISEonline in order to gain an 'Ofsted-eye' view of standards; and issues arising from the RAISE report will often form the basis of a conversation with a school.

In Gloucestershire, the School Improvement team goes one stage further, going through every RAISE report as soon as they are published. Whilst the various sources of data mentioned above provide essential information on standards, they are generally no substitute for getting down to the nitty gritty of pulling a RAISE report apart. If you really want to know what Ofsted thinks about a school's performance then you need to do more than study headline figures. RAISE, however, is somewhat lacking as a school improvement tool, and that's where FFT comes in.

The FFT Governor Dashboard

FFT's simple yet intuitive summary report has proved to be a real game changer for both schools and LAs. As its name suggests, it is aimed at school governors, and the report is already an essential element of our data training package for school governors. However, the dashboard has also become popular with school improvement colleagues and senior leaders, all of whom appreciate its simplicity, clarity and focus; and it has now woven itself into the fabric of school improvement. For myself and many of my colleagues it has become the preferred report to gain a snapshot of a school's performance. Despite its mere 4 pages (3 if you don't count the title page), you can gain almost as much from an FFT dashboard as from an entire RAISE report.

Key to its usefulness are the progress comparisons, whereby a school's results are compared against estimated outcomes based on progress made by pupils nationally with the same prior attainment. This is critical when trying to make sense of data from other sources. We may know that a school is above or below the floor standard but what does that mean in terms of that particular cohort? The FFT dashboard allows us to quickly differentiate between those schools that have low attainment due to poor progress and those that have low ability cohorts. It is always interesting to see a school whose attainment dial is in the red and whose progress dial is in the green. The reverse is perhaps even more interesting, from a school improvement point of view. Whilst you can get this information from RAISE by studying attainment and VA data, it is the side-by-side presentation of these data in the FFT Dashboard that makes it such a useful tool for school improvement.

FFT Self Evaluation reports

An extension of the dashboard, these reports give greater detail on progress of pupil groups, both in terms of VA and CVA, with indicators to identify any significant trends over the last 3 years. Future estimates are given for current cohorts, as well as rankings for past performance. Rather than go further into the detail of these reports, I want to share a case study that illustrates how FFT data was used successfully during a recent Ofsted inspection of a primary school.

The school is in a deprived area and has very low ability intakes. Pupils make excellent progress across KS2 - VA placed the school at the 1st percentile in 2013 - and attainment is high. KS1 results, however, particularly %L2B+ and %L3+, were below average; and it was KS1 that became the focus of the inspection. One of the limitations of RAISE is its lack of KS1 progress data and prior attainment at EYFS, which means that it is the responsibility of the school to demonstrate pupil progress across KS1. This is particularly important where attainment at KS1 is low. The school's tracking data showed pupil progress was good but the low attainment at KS1 continued to be a stumbling block. This is where the school's FFT KS1 self-evaluation report proved invaluable. Like RAISE, it indicated that the school’s KS1 results were significantly below national average, and ranked it below the 90th percentile. VA, on the other hand (i.e. pupil progress compared against that made by pupils with the same EYFS prior attainment nationally), placed the school above the 5th percentile for all indicators, and CVA ranked the school even higher (1st percentile for one measure). These nationally benchmarked progress data and associated rankings, not available in RAISE and not feasible from internal tracking systems, demonstrated that pupils made comparably high progress across KS1 and KS2. The issue of low attainment at key stage one was put into context and the school got the outcome they deserved.

Estimates & Target Setting

First, let's clarify something: FFT don't set targets; they provide estimates. These estimates can provide the basis of a conversation about target setting and that's what we tend to recommend when working with schools in Gloucestershire. For simplicity's sake, and with a view to the future (i.e. FFT Aspire), I tend to stick to PA (prior attainment) estimates, as these are more in line with the estimates used for VA in RAISE, and focus on 50th (average), 20th (high), and 5th (very high) percentile estimates. Whilst I understand that target setting should not be about getting a better RAISE report, I do think that schools value having a clear indication of what constitutes 'average' progress and an idea of what pupils would need to achieve in order for the school's VA to be significantly above average (hint: aim for the 20th percentile). Targets based on FFT estimates are certainly more meaningful, realistic and achievable than those based on a blanket approach of setting 4 points per year adopted in many schools. My own analysis of pupil level data in RAISE indicates this is aiming for the 3rd percentile.

FFT estimates can also help school improvement teams identify schools that are at risk of falling below floor standards. By exporting school level data from FFT Live for current year 6 or 11 cohorts, we can filter on those schools whose future school estimates (based on prior attainment, context and the last 3 years' results) suggest they may fall below floor standards next year. Whilst these are, of course, only estimates, and will always be considered alongside other sources of information - including the advisor's knowledge of the school - they provide a valuable, early indicator of potential standards, and can form part of the conversation with that school.

The future

This is where things get interesting. There are (at least) three other ways in which LAs can make good use of FFT data, two of which can be done now, and I encourage LAs to make better use of these features. The first is the student explorer. Schools are shocked when they see it: the entire pupil census history for their pupils at their fingertips, and they didn't know it was there. Not only does it contain pupil characteristics, but attendance and complete school history at the click of a button. I showed it to a secondary school data manager recently and he had the facial expression of someone stepping into the TARDIS. At Gloucestershire LA, we are starting to use it to identify potential NEET pupils by exporting the entire county Y10 and Y11 data set, and using certain criteria such as attendance, prior attainment and number of school moves as risk factors. There is therefore potential for this data to be used across teams, departments and even agencies.
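
To give a flavour of that risk-factor approach, here's a minimal sketch in Python - the field names, thresholds and the two-factor cut-off are entirely my own illustration, not FFT's:

```python
# Hypothetical sketch of NEET risk-flagging from an exported Y10/11 data set.
# Field names and thresholds are illustrative only.

def risk_score(pupil):
    """Count the risk factors a pupil triggers."""
    score = 0
    if pupil["attendance_pct"] < 90:   # persistent absence territory
        score += 1
    if pupil["ks2_aps"] < 24:          # low prior attainment
        score += 1
    if pupil["school_moves"] >= 2:     # multiple school moves
        score += 1
    return score

def flag_at_risk(cohort, threshold=2):
    """Return the UPNs of pupils triggering at least `threshold` risk factors."""
    return [p["upn"] for p in cohort if risk_score(p) >= threshold]

cohort = [
    {"upn": "A123", "attendance_pct": 85, "ks2_aps": 21, "school_moves": 3},
    {"upn": "B456", "attendance_pct": 96, "ks2_aps": 29, "school_moves": 0},
    {"upn": "C789", "attendance_pct": 88, "ks2_aps": 27, "school_moves": 2},
]

print(flag_at_risk(cohort))  # ['A123', 'C789']
```

In practice the exported data set is far richer than this, and the criteria would be agreed with the teams using the output.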

Next up, collaborate. With increasing numbers of schools forming federations, partnerships, multi-academy trusts, or working closely in clusters, there is increasing demand to be able to analyse data across a number of sites. Schools are interested in producing benchmarking data for their cluster, or identifying strengths and weaknesses in a particular geographical area. These collaborations in FFT need to be set up by an LA administrator and should be encouraged. By creating these groups, LAs can help foster better partnerships between schools and gain insight into issues in certain areas. Collaborate is already a useful feature but is set to become even more relevant as the education landscape evolves.

And finally, Virtual Schools. Not a feature in FFT Live, but it's coming soon to FFT Aspire, giving LAs the power to set up blank schools in FFT, which can be populated with pupils by pulling their data into the school via their UPNs. This is exactly what Virtual Schools have been crying out for: the ability to analyse the progress made by all their pupils regardless of location. And if there was ever an argument for the appropriateness and relevance of CVA it is surely here, when dealing with children in care. If this feature can be extended to PRUs, even better.


So that just about wraps up this blog on how school improvement teams can make use of FFT: how it challenges RAISE and other sources of data, and supplements our own intelligence about schools. It gives us detailed insight into school performance at all key stages, provides benchmarks, guides target setting, helps schools collaborate, and enhances cross-team and multi-agency working. In short, it provides LAs with an indispensable array of intelligent analyses. There is certainly more to FFT than D.

Tuesday, 16 September 2014

2B or not 2B, that is the question

I've visited a number of schools in the past couple of weeks and nearly all of them intend to continue tracking with levels for all cohorts for this term at least (fine for years 2 and 6, of course). This doesn't surprise me - it's the comfort of the familiar - but it's a bodge and I am becoming increasingly concerned. The big problem is that continuing with levels gives the false impression of parity and compatibility of data either side of the old/new NC boundary. This will inevitably invite comparisons, which are unlikely to do the school any favours. It's like comparing currencies pre- and post-decimalisation. By making a fresh start - using an entirely new approach - any such issues can be avoided. A line has been drawn.

The main issue is that pupils are going to appear to have gone backwards. Schools continuing with a levels-based system are planning to assign those pupils that have met all the key learning objectives for that point in the year a sublevel/point score that historically indicated age-related expectations (ARE) under the old system. So, a pupil that has met all learning objectives for the end of Y4 will be assigned a 3B/21 points, because that's how it used to work. That's the theoretical equivalent.

Sounds fair enough.

However, the new curriculum does not translate into levels, and those old 'age-related expectations' are not a proxy for having met the key learning objectives of the new curriculum. Implying that they do is going to cause problems. I'll give you an example:

A pupil finishes KS1 with a L3 in reading (that's around 30% of pupils nationally last year). And as you may or may not know, a L3 is treated by the DfE as a secure level 3, i.e. a 3B (21 points). Now, under the old system of levels, a 3B was considered to be age-related expectations for the end of Y4. In the new curriculum, a pupil deemed to be at age-related expectations at the end of Y4 will have met all the learning objectives for that point in the curriculum. So, ask yourself this: has the KS1 L3 pupil done this? The answer is almost certainly no, which means you can't really continue to assign them a 3B. Instead they will have to be assigned a new sublevel; a translated value that reflects their position in the new curriculum, i.e. above expectations, but not 2 years above. Maybe a 2A. Who knows? 
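
To put numbers on that apparent drop, using the old scale of 2 points per sublevel (the 2A translation is, as above, purely illustrative):

```python
# Old national curriculum point scale: each sublevel is worth 2 points.
POINTS = {"2C": 13, "2B": 15, "2A": 17, "3C": 19, "3B": 21, "3A": 23}

# The KS1 L3 pupil is treated by the DfE as a secure level 3...
old_sublevel = "3B"          # 21 points: old ARE for the end of Y4

# ...but their position in the new curriculum might translate to a 2A.
translated_sublevel = "2A"   # illustrative - who knows?

drop = POINTS[old_sublevel] - POINTS[translated_sublevel]
print(drop)  # 4 points, i.e. two whole sublevels
```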

In other words, they've apparently gone backwards.

Which is daft.

Some tracking systems have not helped matters by a) allowing users to continue with levels, and b) mapping new values back to old point scores and sublevels, implying there is a simple conversion. 

I suggest schools do themselves a favour: ditch levels now. You'll have to at some point anyway. Adopt a new assessment system and avoid the pitfalls that will inevitably arise by giving the impression of data continuity. A new system will not invite such comparison. You can start afresh. 

So, use your historical data to show progress and attainment up to the end of last year, and then start again this year. Don't attempt to measure progress across the old/new NC boundary by using end of last year assessments as a baseline. Instead, create an early autumn assessment and measure progress from there. Concentrate on tracking the percentages of pupils that are below, at and above ARE; hopefully showing increases in those at and above ARE as the year goes on. Individual progress comes down to books and the percentage of objectives met. That's pretty much all we can do at this point. Next year things get easier because you'll have a compatible baseline for more in-depth and reliable analyses, but producing the 3 year progress data stipulated in the Ofsted guidance is not going to be easy. I just can't see how it can be done with any degree of reliability and I'm not sure they've thought it through. I suspect these issues will become increasingly apparent over the course of this year.
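
That below/at/above tracking can be sketched very simply - the cohort and figures here are invented for illustration:

```python
from collections import Counter

def are_profile(cohort):
    """Percentages of pupils below, at and above age-related expectations."""
    counts = Counter(p["are"] for p in cohort)
    n = len(cohort)
    return {cat: round(100 * counts[cat] / n) for cat in ("below", "at", "above")}

# A made-up cohort of 30, assessed in early autumn and again in summer.
autumn = [{"are": c} for c in ["below"] * 10 + ["at"] * 15 + ["above"] * 5]
summer = [{"are": c} for c in ["below"] * 6 + ["at"] * 17 + ["above"] * 7]

print(are_profile(autumn))  # {'below': 33, 'at': 50, 'above': 17}
print(are_profile(summer))  # {'below': 20, 'at': 57, 'above': 23}
```

The hoped-for pattern is exactly this: the at/above percentages creeping up between assessment points.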

And finally, I know that many tracking systems are not quite up to speed, and Ofsted make provision for this in the new guidance (see Ofsted handbook p63, para. 191). So, I'm not advocating throwing everything out but do make sure you ask the right questions of your supplier. It's fine to hold on (for a bit) whilst new versions are rolled out but make sure they have solid plans for assessment without levels (hint: they should have already!). It must be very tempting for established systems to stick as closely as possible to levels and APS because it requires a lot less redevelopment. But that doesn't necessarily mean it's right for schools.

Remember: the tracking should fit the curriculum, not the other way round. 

Good luck!

Friday, 5 September 2014

No attainment gaps please, we're intelligent.

If there's one phrase I hear in the course of my professional life that's guaranteed to result in my forehead colliding with a desk multiple times in quick succession (or wall if I'm standing up), it's 'attainment gaps', especially if it's preceded by the words 'can you provide me with some....' and followed by a question mark. I then punch myself in the face and run out of the room.

I just find it extraordinary that we're still dealing with these half-baked measures. Serious decisions are being made based on data that is taken entirely out of context. Recently the DfE sent letters out to secondary schools that had 'wide' attainment gaps between the percentage of pupil premium and non-pupil premium students achieving 5A*-C including English and Maths. Schools on the list included those where pupil premium attainment was higher than that of pupil premium students nationally, perhaps higher even than overall national averages. Meanwhile those schools that had narrow gaps because both groups had similarly low attainment rates were left off the list. Bonkers!

On the subject of bonkers, I'm a governor of a junior school that failed to get outstanding purely because of the pupil premium attainment gaps identified in the RAISE report. This was despite us pointing out that their VA was not only higher than that of pupil premium nationally, but higher than non-pupil premium nationally, and that their VA had significantly increased on the previous year. So, the pupil premium pupils enter the school at a lower level than their peers and proceed to make fantastic progress but that was not enough. The gap hadn't closed. And why hadn't the gap closed? Because of the attainment of the higher ability pupils, particularly with the level 6 (worth 39 points) bumping APS even further. It would seem that the only way out of this situation is to not stretch the higher ability pupils (I don't advocate this). And anyway, there's a double agenda to close the gap AND stretch the most able. How does that work? Double bonkers!

And if there's an 8th circle of data hell (the 7th circle already occupied by levels) it should be reserved for one thing and one thing alone:

SEN attainment gaps

Oh yes! Everyone's favourite measure. Basically, the gap between the percentages of SEN and non-SEN pupils achieving a particular measure. These are obviously really useful because they tell us that SEN pupils don't do as well as non-SEN pupils (that was sarcasm, by the way). Often they're not even split into the various codes of practice; just all SEN grouped together. Genius.

When I'm asked for SEN attainment gaps, I try to calmly explain why it's an utterly pointless measure and that perhaps we should be focussing on progress instead. And maybe we shouldn't be comparing SEN to non-SEN at all. Sometimes my voice goes up an octave and I start ranting. Usually this does no good whatsoever so I punch myself in the face again and run out of the room. 


Buried in the old Ofsted subsidiary guidance was the following table of despair:

It tells us that a 3 point gap indicates a year's difference in ability, and that a 1 point gap equates to a term's difference. Such language was often found in inspection reports, despite the same document stating (on page 6) that 'the DfE does not define expected progress in terms of APS'.

But the times, they are a-changin'. It is encouraging that this table has been removed from the new handbook. I assume this means no more statements about one group being a year behind another based on APS gaps, and no more such data having a negative impact on inspections. Ultimately, I hope that the removal of the table from Ofsted guidance sounds the death knell for this flawed, meaningless and misleading measure, but I won't hold my breath. I'm sure I'll have to punch myself in the face a few more times yet.

Wednesday, 27 August 2014

Attack of the Clones: are data people trying to replace levels with levels?

A couple of days ago the opening salvos of a war between good and evil were fired across the vast expanses of the Edutwitterverse. From their distant quantitative system of darkness, the number crunching legions of the Evil Empire of Analysis were positioning their Data Death Star in orbit around the peaceful and progressive Moon of Awol (that's assessment without levels, in case you didn't know); and much was at stake. 

Well, OK, there was a minor skirmish involving some words and some huffing, and some good points were made. Mainly, I have to confess, by Michael Tidd (light side), and not so much by me (dark side). Michael (follow him on twitter @MichaelT1979 - he knows stuff) has already convincingly blogged his side of the argument here. I also hurriedly wrote a piece detailing my thoughts but managed to lose the whole thing after 2.5 hours of effort. Typical. Try again!

So what's the big issue?

Well, to put it simply, the question is this: do we still need a numerical system for tracking now that levels have gone?

Which caused one person to ask: is this 'a case of the data bods driving the agenda'?

Whilst someone else worried that 'it's beginning to sound a lot like levels', which I'm fairly certain was a Christmas hit by Michael Buble.

They have a point. So, before I go any further I'd like to state the following:

1) I'm no fan of levels. They are too broad, sublevels are meaningless, and they have resulted in the most dangerous/pointless progress measure ever devised.

2) I don't believe a numerical system is required for reporting to parents and pupils. As a parent I am far more interested in what my daughter can do, and what she needs more support with, than some arbitrary number. 

3) I understand that assessment and tracking are different things.

So, do we still need a numerical system for tracking purposes? Well, I think we do. I fully support objective-based assessment models - they make perfect sense - but I also believe that conversion to a standardised numerical system will allow for more refined measures of progress, particularly at cohort or group level, and over periods longer than a year. To reiterate, these do not need to be used in the classroom or reported to parents; they would simply provide the numbers for analysis. They would be kept under the bonnet, fuelling the tracking system's engine; and this is the approach that most tracking systems have adopted. It remains to be seen, of course, how well these work in practice and whether schools start reporting these figures to parents and pupils. I hope not.

Ofsted and RAISE

So, this is where I have to make a confession: Ofsted concern me and many of my opinions about tracking and analysis relate to inspection, which is rather depressing I know, but I think it's necessary. No, I don't think we should build tracking systems solely to satisfy Ofsted but I think it's foolhardy not to consider them. Having been involved in a number of difficult inspections in the past year, I know that data presentation (particularly fine analyses of progress) can often make the difference between RI and Good, which again is depressing, but it's a reality. If we want to get an Ofsted-eye view of school data, just look at RAISE. If you want to counter the arguments stemming from RAISE then it pays to be able to present data in a similar format, in a way that an Ofsted inspector will find familiar. And let's face it: inspectors aren't going to find many things familiar this year. 

The measure that concerns me most is VA - a measure of long term progress, comparing actual against expected outcomes. Without resorting to any particular metric, we can address the proposed new floor measure of 85% making the expected standard by end of KS2 by showing the percentage of the cohort/group that are at or above the school's defined expectations linked to the new national curriculum. Mind you, to digress for a bit, I have a couple of issues here, too. Being at or above the expected level is not necessarily the same as being on track. The pupil may have made no progress or gone backwards. Also, if the school only defines the expected level at the end of the year, will this mean that all pupils are below expectations until the end of the year is reached, like a switch being turned on? Where will this leave the school? Would it not make sense to have a moving definition of expected to allow for meaningful analysis at any point in the year? Just a thought.

Back to the issue of measuring progress: under various proposed assessment models, we can easily analyse in-year progress by counting the steps pupils make, and we can also measure progress by monitoring the shifts in percentages of pupils in a cohort that are below, at or above expectations. But long term, VA-style progress measures are trickier. If no numerical system exists, how does the school effectively measure progress to counter any negative VA data in RAISE? I'm really struggling with this, and I suspect that many if not most headteachers would like their assessment system underpinned by a numerical scale, which will allow progress to be quantified. We know that a floor standard, measuring progress from beginning of EYFS to end of KS2, will be implemented and will be of huge relevance to schools, the majority of which will (initially at least) fall below the 85% expected standard threshold mentioned above. I'm assuming that schools will want to in some way emulate this VA measure in their tracking by quantifying progress from EYFS to the latest assessment point, and perhaps project that progress to make predictions for the end of the key stage.

Another confession: I made the assumption that these assessment models rely on sequential attainment of objectives. If this were the case then a decimalised curriculum year-based model would be useful and neat. For example, categorising a pupil as a 4.5 because they are working within the year 4 curriculum and have achieved 50% of its objectives. Simple. And of course would allow meaningful comparison between pupils within a cohort and even between schools. However, as was pointed out to me, this is not how pupils learn and it doesn't tell us what 50% they've achieved (it's not necessarily the first 50%). This was what we were debating yesterday when the 'data bods driving the agenda' accusation was fired at us. The author of that comment has a good point. 

However, in my defence - and I'm sure it's the same for most other data people - I don't want to drive the agenda. I spend most of my time in schools, working with headteachers, senior leaders, teachers and governors, and I'm constantly learning. I change my mind pretty much every time I look at twitter. My opinion is like Scottish weather: if you don't like it, just wait 20 minutes. I simply want to ensure that schools have the best tools to do their job and to defend themselves against Ofsted. That's it. I'm not interested in unwieldy, burdensome, time consuming systems; a data system should simplify processes, save time and improve efficiency. It should be a servant, not a master. And yes, its primary function is to inform teaching and learning.

So, to summarise a rather rambling blog, I'm excited about the removal of levels and see it as an opportunity to innovate. As a parent I am more interested in knowing what my daughter can and can't do, than her being assigned a meaningless level. I just think that tracking works best when data is converted to a standardised numerical system. This numerical scale should be used for strategic analysis, to help senior leaders compare current school performance against that outlined in RAISE. I don't think that new numerical systems should replace levels and be used for reporting purposes.  Any such systems must be kept guarded within the mainframe of the Data Death Star at all times.

And we'll leave those cute little Awols alone.


Data Vader
Level 5 (Sublevel C)
Data Death Star

Wednesday, 20 August 2014

Using on entry CAT tests in junior schools (and how I intend to buy new climbing shoes)

Some things in life are certain: death, taxes, getting a 'sorry I missed you' card from the postman when you've just nipped to the loo for 2 minutes. Oh, and having the conversation about the accuracy of infant schools' KS1 results whenever you find yourself in the same room as a junior school headteacher. This is a conversation I have regularly. If I had a pound for each time I've had this conversation, I reckon I'd have about £87 by now, which is nearly enough for a new pair of climbing shoes. I always need new climbing shoes.

I'm going off topic.

Sometime ago, a junior school head came to visit me in my office. She wanted to discuss the issue of KS1 data accuracy (obviously). I pushed my jar of pound coins towards her, strategically placed a climbing gear catalogue within line of sight, and prepared myself for some proper headteacher ranting. But this head didn't want to rant; she wanted to take some action. She wanted to do stuff. She wanted data. Which is always nice.

So, after some discussion we hatched a plan: to carry out CAT tests on entry in as many Junior schools as possible. We had no idea if this project would be of any use and what we would do with the data when we got it but it sounded like positive action and we thought it would be pretty neat, too. In the end after numerous meetings and emails, 13 out of the 20 junior schools in Gloucestershire got involved and a date in early October was set for their new Year 3 intakes to do the tests. Exciting!

The test itself is known as a PreA test and is specifically designed to be carried out early in year 3. If you'd like to learn more about these and other CAT tests, please contact GL Assessment.

I said above that we didn't know what we would do with the data, which is not entirely true. I had a sort of, kind of idea. A CAT test provides scores for a pupil's verbal, non-verbal and quantitative reasoning; it does not generate a level or sublevel that can be directly compared with the pupil's KS1 results. However, like other CAT tests, the PreA test would provide an English and Maths estimate for the end of KS2 in the form of a sublevel. I thought it would be interesting to compare these estimates with those generated using RAISE VA methodology. Not exactly a perfect solution, but compelling, in a data-ery sort of way.

So, once the junior schools had carried out the PreA tests in October last year, they sent me the data. I then converted each pupil's KS2 sublevel estimates generated by the tests into point scores (by the way, I don't like using the term 'APS' here because they're not averages. I'm pedantic like that). Next I put each pupil's KS1 results into my VA calculator (more information on that here) to generate end of KS2 estimates using RAISE VA methodology, and took estimated point scores for each pupil. I now had two point score estimates for the end of KS2 for each Y3 pupil in the 13 junior schools taking part: one based on the CAT PreA test; the other based on their KS1 results. Neat! Now all I had to do was subtract the CAT estimate from the RAISE VA estimate to find which was higher. Positive figures would indicate that the estimate derived from KS1 results was in advance of that derived from the CAT tests; negative figures would indicate the opposite. 'So what?' I hear you shout. Fair question, but bear in mind that it's the RAISE VA estimate that the pupil's progress is measured against (well, sort of, because, actually, their estimates won't really be calculated until they've done their KS2 SATS, but we're trying here, OK?). And if the RAISE VA estimate (i.e. the one based on KS1) is always higher than the CAT estimate then this could be rather worrying, as it may indicate that the future VA bar will be set unrealistically high for those pupils.
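
Here's the mechanics of that comparison with made-up pupils, converting sublevels on the standard 2-points-per-sublevel scale (e.g. 4B = 27):

```python
# Point scores for KS2 sublevels on the old scale (each sublevel = 2 points).
POINTS = {"3C": 19, "3B": 21, "3A": 23, "4C": 25, "4B": 27, "4A": 29,
          "5C": 31, "5B": 33}

# Hypothetical pupils, each with two end-of-KS2 estimates:
# one from the CAT PreA test, one from RAISE VA methodology (KS1-based).
pupils = [
    {"name": "pupil 1", "cat_est": "4C", "va_est": "4A"},
    {"name": "pupil 2", "cat_est": "4B", "va_est": "4B"},
    {"name": "pupil 3", "cat_est": "3A", "va_est": "4B"},
]

# Subtract the CAT estimate from the VA estimate: a positive result means
# the KS1-based estimate is higher, i.e. a higher bar for future VA.
for p in pupils:
    p["diff"] = POINTS[p["va_est"]] - POINTS[p["cat_est"]]

print([p["diff"] for p in pupils])  # [4, 0, 4]
```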

So what was the outcome?

Well, the estimates based on KS1 results were higher than those based on the CAT test in pretty much every case. I'm writing this at home without the full dataset in front of me, but we're talking about approximately 600 pupils here. It was quite startling. Wanna see some data? Course you do.

School              English   Maths
Junior School 1       2.3      1.9
Junior School 2       1.6      1.9
Junior School 3       4.3      4.0
Junior School 4       2.7      2.4
Junior School 5       3.3      1.8
Junior School 6       2.7      3.2
Junior School 7       2.6      3.2
Junior School 8       3.3      2.3
Junior School 9       6.0      6.9
Junior School 10      2.5      2.1
Junior School 11      4.3      4.9
Junior School 12      2.3      1.6
Junior School 13      1.5      1.1
Average               3.0      2.9

The table and chart above (it's always nice to have the same data presented in different ways - I learnt a lot from RAISE) show the average differences (this actually is APS!) between the end of KS2 estimates derived from CAT PreA tests and those generated using RAISE VA methodology, for both English and Maths. I used 2012 methodology, by the way, as it produced English estimates, rather than the separate reading and writing estimates of 2013, and so matched the CAT test data. As you can see, the average difference for the group of schools is 3 points for both English and Maths, i.e. VA estimates based on KS1 outcomes are 3 points (1.5 sublevels) higher than those based on the CAT tests. Some schools' differences are very small (e.g. schools 2 and 13), so estimates based on KS1 and CAT tests are similar, and this could be taken as evidence that KS1 results are accurate. Maybe differences of 2 APS or less are within the limits of tolerance, but three of the above schools (3, 9 and 11) have very big differences and these are perhaps the most concerning. Schools 3 and 11 have differences of 4-5 APS (2-2.5 sublevels) and school 9 has a difference of 6 APS in English and 7 APS in Maths (an entire level).

Obviously I'm making the assumption that CAT tests are reliable and accurate predictors of end of key stage outcome, but if this is the case (and many evidently think they are), and if the estimate differences detailed above can be taken as a proxy for the gap between KS1 results and pupils' actual ability, then the children in these three schools in particular have some serious ground to make up just to break even in terms of VA. Considering that, on average, cohorts need to make around 13 points to get a VA score of 100 (it's actually around 13.4 but let's not split hairs), the pupils in schools 3 and 11 would, in reality, need to make 17 points to make expected progress (in terms of VA). Meanwhile pupils in school 9 will need to make 19-20 points to reach the VA 100 line. Somewhat unlikely, and blue boxes in RAISE may be hard to avoid. Interestingly, my friendly junior school head teacher, mentioned above, maintains that pupils in her school need to make 16 points of progress in reality (i.e. from the school's own baseline assessment) to get a positive VA score. The CAT vs VA experiment backed up her assertions.
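
The arithmetic behind those figures is simply the expected KS1-to-KS2 gain plus each school's average estimate gap from the table above - a back-of-envelope sketch:

```python
# Cohorts need roughly 13.4 points of KS1-to-KS2 progress to hit the VA 100 line.
EXPECTED_GAIN = 13.4

# Average estimate gaps (mean of English and Maths) for the three outlier schools.
gaps = {
    "school 3":  (4.3 + 4.0) / 2,
    "school 9":  (6.0 + 6.9) / 2,
    "school 11": (4.3 + 4.9) / 2,
}

# If the gap is a proxy for KS1 over-statement, the real progress needed is:
required = {school: EXPECTED_GAIN + gap for school, gap in gaps.items()}
print(required)  # schools 3 and 11 need ~17-18 points; school 9 nearly 20
```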

So, that's it really. Deeply flawed, I know, but interesting and a worthwhile exercise (the data was used by one school as part of their evidence base for inspection and proved very useful). The lack of a control group is an obvious issue here and needs to be addressed in future. Ideally we'd like to get 10 primary schools to take part at some point. Traditionally schools have carried out CAT testing in Year 5 but more schools are considering alternatives. I actually think it's worth doing them earlier as you have more time to act on the data, so perhaps more primary schools would be interested in testing in Year 3. Many of the junior school heads involved in this project intend to continue using the tests as they gave them an alternative and rich source of information on pupils' strengths and weaknesses, which they didn't have previously. This is a positive thing.

And finally, please can I state that this is not intended to be an exercise in infant school bashing. I'm very fond of infant schools, some of my best friends are infant schools, but this issue always crops up when talking to junior schools so I thought it would be interesting to test their claims. I suspect that similar issues occur in primary schools and that's why we need a primary control group for this research to have any real validity.

Anyway, that's the end of this blog. Hope it was useful, or at least interesting.

Oh, and by the way, I am now a governor of a junior school and own a new pair of climbing shoes.

Wednesday, 23 July 2014

How to calculate L4+RWM

It's that time of year again. I should have written this a few weeks ago when the results were made available on NCA tools but haven't had a chance, so sorry about that. 

Right, first of all, let's bust a few myths and kill some (very) bad practice. This is how NOT to calculate L4+RWM:

1) Do not assume it is the lowest of the three percentages achieving L4 in the individual subjects. It might be; it might not. And

2) Never, EVER calculate the mean average of the three percentages for L4 in the individual subjects. 

The latter is the most heinous data crime I can think of. If you do this, stop doing it. Stop it now.

If you are struggling to understand why, then imagine a cohort of 9 pupils, of which 3 achieve L4 in reading, 3 achieve L4 in writing, and 3 achieve L4 in maths. Using both of the above 'methods', the answer would be 33%. But what happens if it's a different 3 pupils in each subject? Then the percentage achieving L4+ in reading, writing and maths is 0%. To calculate L4+RWM you have to start at pupil level and work up.
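
The 9-pupil example in code - both shortcut 'methods' give 33%, while the true pupil-level figure is 0% when the three pupils differ in each subject:

```python
# Nine pupils; in each subject a different three achieve L4+.
reading = {"p1", "p2", "p3"}
writing = {"p4", "p5", "p6"}
maths   = {"p7", "p8", "p9"}
cohort_size = 9

# Wrong method 1: take the lowest single-subject percentage.
lowest = min(len(reading), len(writing), len(maths)) / cohort_size    # 33%

# Wrong method 2: average the three percentages.
mean = (len(reading) + len(writing) + len(maths)) / 3 / cohort_size   # 33%

# Correct: start at pupil level - who achieved L4+ in ALL three subjects?
achieved_all = reading & writing & maths
correct = len(achieved_all) / cohort_size                             # 0%

print(lowest, mean, correct)
```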

So, how to calculate L4+RWM. First, log onto NCA Tools and download the results summary file. The spreadsheet will have columns for pupil name, gender, DoB, and levels for reading, GPS (SPaG), and maths. You need to delete the GPS levels (they are not used in this calculation) and replace them with the writing levels. You can cut and paste from another source but ONLY after you have made sure it's in the same order. It might be best to manually type them in. Also, change the column heading to 'writing level' so it makes sense when you look back at it.

You should now have a spreadsheet with levels for reading, writing and maths for all pupils. If any pupils have an N or a B you should replace this with 0, 1 or 2. It doesn't much matter which as far as the calculation is concerned, as none are L4+ anyway.

If any data is missing then this will affect your unvalidated RAISE. Changes should take effect in the validated RAISE, which will be published in Spring 2015. So, if you want to calculate the true (final) L4+RWM figure, put their TA levels in; if you want your unvalidated RAISE figure then change them to 0. Probably best to make a copy and do both so you know where you stand.

Now, click in the cell in the first blank column, to the right of the maths level column; and, for the first pupil, enter the following formula:

=IF(AND(E2>=4,F2>=4,G2>=4),1,0)

This assumes the levels for reading, writing and maths appear in columns E, F and G respectively, and that the first pupil's data is in row 2. Please check this and modify the column and row references in the formula accordingly.

Then copy the formula down to the last child. You can either copy the cell, select all the remaining cells, and paste; or simply grab the bottom right corner of the highlighted cell containing the formula and drag it down to the last pupil.

The formula will assign a value of 1 to those pupils with L4 or above in each of the individual subjects, and a 0 to those that don't fulfil the criteria, e.g. are L4 in reading and writing but L3 in maths. If the pupil is L5 in all subjects they will get a 1; if they are L5 in two subjects and L3 in another they will be assigned a 0.

Then you just need to calculate the percentage of pupils with a 1. You can count up the pupils with a 1 and divide by the total number of pupils, or you can use a formula such as:

=AVERAGE(H2:H30)

This will sum the 1s (i.e. those pupils that meet the criteria) and divide by the total number of pupils.

Please note: H2:H30 in the above formula is the range. It is an example. It assumes our 1 and 0 data exists in column H and that pupils range from row 2 to row 30 (i.e. there are 29 pupils). It may be that your formula is in a different column, and chances are you have more or fewer pupils, so please don't just copy the above formula. Adjust it to take account of the column references and full range of pupils, or it won't work.

You should now have a proportion figure in the cell containing your average formula (e.g. 0.86). To convert to a percentage simply click on the % button in the menu bar.
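
If you'd like to sanity-check the spreadsheet, here is the same calculation sketched in Python (the levels are invented; N and B entered as 0, as above):

```python
# Each pupil's levels for reading, writing and maths; N/B recorded as 0.
pupils = [
    {"reading": 4, "writing": 4, "maths": 4},   # counts: L4+ in all three
    {"reading": 5, "writing": 4, "maths": 3},   # doesn't: L3 in maths
    {"reading": 5, "writing": 5, "maths": 5},   # counts
    {"reading": 4, "writing": 0, "maths": 4},   # doesn't: N/B in writing
]

# The 1/0 flag step: 1 if the pupil is L4+ in all three subjects, else 0.
flags = [1 if all(p[s] >= 4 for s in ("reading", "writing", "maths")) else 0
         for p in pupils]

# The averaging step: sum the 1s and divide by the number of pupils.
pct_l4_rwm = 100 * sum(flags) / len(flags)
print(pct_l4_rwm)  # 50.0
```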

Finally, if you want to calculate L5+RWM, then start again in a new column and change the initial formula to:

=IF(AND(E2>=5,F2>=5,G2>=5),1,0)

And repeat the process.

A word of warning: if you copy and paste the 1/0 formula into another column to calculate L5+ then the column references will change (unless you've fixed the position using the $ sign - worth learning how to do this if you don't know already). For this little project, probably best to just retype the formula from scratch, changing the 4 to 5, as described above, and keeping the column references the same. 

So, that's it: how to calculate %L4+RWM. Hope it's useful.