Sunday 13 September 2015

Bias in the Work Capability Assessment: Analysis of Results of 1,000,000 WCAs

Update 14/09/2015: I've added a bit at the bottom.

Abstract

The Work Capability Assessment (WCA) of Employment and Support Allowance (ESA) is the legal test by which eligibility for ESA, the UK’s social security benefit for those unable to work due to illness or disability, is determined. This is an update of my previous analysis of Incapacity Benefit claimants being migrated over to ESA, including an extra 200,000 results. All the data is from freely-available government datasets. I show that there is significant bias against the poorest and most disabled in the WCA and this must be urgently addressed.

Introduction
On my old blog in 2013 and 2014, I conducted a statistical analysis of the outcomes of the Work Capability Assessment for Incapacity Benefit claimants being migrated to Employment and Support Allowance. I showed significant relationships between WCA judgements and variables like life expectancy and local rates of poverty, and interpreted this to mean that the WCA is biased against people in poorer or less healthy areas.

Since that analysis, I’ve learned that the United Nations are investigating the UK for “grave and systematic” abuses of the human rights of disabled people; a group in which I am included. What better time to update that analysis?

There are numerous variables which ESA should logically be related to.  The most immediately obvious is local rates of disability and health. There should be a higher proportion of people being found fit for work in healthier areas, with lower disability. Conversely, there should be a higher proportion of people going into the Support Group in areas with lower life expectancy and higher rates of disability.

There have been accusations that the WCA is not a valid assessment of disability. If this is the case, there will be relationships with variables other than disability. In the UK's current political climate, cutting government expenditure has been touted as one of the key goals of both governments since 2010, when ESA began being rolled out on a wide scale. As the majority of people getting ESA are poor (and get income-based ESA), I have chosen to test relationships with financial deprivation. At the suggestion of an ESA whistleblower, I have also chosen to test the relationship between WCA outcome and educational attainment; their suggestion was that people with better educations are more capable of getting into the WRAG and Support Groups.

Method

I collated data from various government sources (full details are in the data provided) including Public Health England and the Department for Work and Pensions. I designed the dataset to show relationships at the Local Authority level as this was the most consistent level of abstraction used amongst the various sources of information. I chose to combine the deprivation variables into a composite measure, averaging out the Local Authority residents in the 20% richest areas of England with the 20% poorest areas of England. In this composite measure, a lower score means lower rates of deprivation.

I then conducted statistical analysis using SPSS 20 and Microsoft Excel.

Descriptive Statistics

There is a great deal of variance between areas in the proportion of claimants that are put into each WCA category as shown in Table 1. In some Local Authorities, almost 50% of people are being judged fit for work compared to 25% in other areas.


                     % Fit for Work   % WRAG   % Support Group
Mean                 18.95            37.17    43.88
Standard Deviation   3.57             4.177    5.12
Minimum              10.87            24.72    33.24
Maximum              46.63            46.63    58.25
Table 1: Descriptive statistics for WCA Outcome rates

Correlations

I first chose to test whether there were relationships between WCA outcomes and the variables of interest to determine if it was worth conducting a deeper analysis, and this is shown in the correlation matrix (Table 2) below. All reported coefficients are statistically significant (p<.05) unless marked ns, and N=324. All tests are two-tailed.


                   Life Expectancy   % with moderate or severe disability   Deprivation   % with 5 A*-C GCSEs
% Fit for Work     -.559             .315                                   .501          -.233
% WRAG             ns                .314                                   ns            -.239
% Support Group    .437              -.475                                  -.340         .358
Table 2. Pearson’s r correlation matrix of WCA outcomes with important variables. ns = not significant.
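As a rough illustration of what "statistically significant" means for these coefficients, the sketch below converts a Pearson's r into a t statistic using the standard formula t = r * sqrt((N - 2) / (1 - r^2)). This is not the SPSS output: the function names are made up for this example, and the critical t value of 1.967 (two-tailed, p<.05, 322 degrees of freedom) is an approximate figure taken as an assumption.

```python
import math

def t_from_r(r, n):
    """t statistic for a Pearson correlation r based on n paired observations."""
    df = n - 2
    return r * math.sqrt(df / (1.0 - r * r))

def smallest_significant_r(n, t_crit=1.967):
    """Smallest |r| that reaches significance, given an (assumed) critical t."""
    df = n - 2
    return t_crit / math.sqrt(t_crit ** 2 + df)

N = 324  # number of Local Authorities, as reported above

# Correlation between % Fit for Work and life expectancy from Table 2.
t_ffw_life = t_from_r(-0.559, N)
r_min = smallest_significant_r(N)

print(f"t for r = -.559: {t_ffw_life:.2f}")
print(f"smallest |r| significant at p<.05 with N = {N}: {r_min:.3f}")
```

With a sample this large, even quite small correlations clear the significance threshold, which is why the size of r matters as much as its p value when reading the table.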

These correlations are illustrated below in Images 1, 2 and 3.


Image 1. WCA Outcome vs local life expectancy.

Image 2. WCA outcome vs local educational attainment

Image 3. WCA outcome vs local deprivation


Interim conclusion

It is clear that there is a complex relationship between the variables, given that the direction of the relationship between the test variables and WCA outcomes changes from one outcome to the next. However, it is also clear that the outcome of the WCA is not purely dictated by the levels of disability in an area; if this were the case, the proportion of claimants going into the WRAG and Support Groups would be positively correlated with the proportion of people with moderate to severe disabilities, and the proportion of people being found fit for work would show a negative correlation. This is not the case, and as the results show, the more disabled people there are, the fewer people are going into the Support Group and more are being found capable of work and having their benefit eligibility withdrawn.

Given the complex interrelationship between the variables, I have chosen to conduct three stepwise linear regressions, one for each WCA outcome. This will statistically control for the effect that, for example, poverty has on health, and will give us a clearer picture of what the true relationship between the variables is.

Regression Analysis

Fit For Work judgements

In this regression, both deprivation and the rates of severe disability contributed to a statistically significant model (R=.511, F(2,321)=56.595, p<.001). As deprivation increased, the rate of fit for work judgements also increased. As the rate of severe disabilities increased, the rate of fit for work judgements increased also.

Work-Related Activity Group judgements

In this regression, the proportion of people with moderate disabilities, the proportion of those with 5 A*-C GCSEs and local deprivation levels all contributed to a significant model (R=.408, F(3,320)=21.341, p <.001).

As deprivation and the rate of GCSEs increased, the proportion going into the Work-Related Activity Group decreased. However, as rates of moderate disability increased, so did the rate of people going into the WRAG.

Support Group

In this regression, both the proportion of people with severe disabilities and the proportion of those with 5 or more A*-C GCSEs were contributors to a significant model (R=.515, F(2,321)=57.911, p<.001).

As the rate of severe disabilities increased, the proportion of claimants in the Support Group decreased. However, as educational attainment increased, the proportion going into the Support Group also increased.
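The three model summaries above can be sanity-checked with a few lines of arithmetic. The sketch below is not the SPSS procedure; it only applies the textbook relationship between the multiple correlation R and the F statistic, F = (R^2 / k) / ((1 - R^2) / (N - k - 1)), where k is the number of predictors. The function name is made up for this example. Because R is reported to three decimal places, the recomputed F values should land close to, but not exactly on, the reported ones.

```python
def f_from_r(r, n, k):
    """F statistic for a linear regression with multiple correlation r,
    n cases and k predictors."""
    r2 = r * r
    df_model = k
    df_resid = n - k - 1
    return (r2 / df_model) / ((1.0 - r2) / df_resid)

N = 324  # number of Local Authorities

# (reported R, number of predictors) for each model described above
models = {
    "Fit for Work": (0.511, 2),
    "WRAG": (0.408, 3),
    "Support Group": (0.515, 2),
}

recomputed = {}
for name, (r, k) in models.items():
    recomputed[name] = f_from_r(r, N, k)
    print(f"{name}: R^2 = {r * r:.3f}, F({k},{N - k - 1}) = {recomputed[name]:.2f}")
```

The R^2 column is the more intuitive number: it is the share of the variation in each outcome between Local Authorities that the model accounts for.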

The regression analyses are summarised in Table 3, below.

WCA Outcome     Variable                B        Std. Error   beta    t        Sig.
Fit for Work    Constant                13.610   .754
                Deprivation             1.497    .226         .419    6.611    <.001
                Severe Disability %     .234     .117         .127    2.00     .046

WRAG            Constant                36.198   3.096
                Moderate Disability %   .981     .163         .331    6.030    <.001
                GCSE %                  -.114    .035         -.190   -3.238   .001
                Deprivation             -.704    .235         -.169   -2.996   .003

Support Group   Constant                44.411   2.960
                Severe Disability %     -1.077   .139         -.406   -7.735   <.001
                GCSE %                  .140     .039         .190    3.615    <.001
Table 3. Final regression models and statistics for each WCA outcome
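To make the B column concrete, here is a minimal sketch of how the Support Group model turns local figures into a predicted Support Group rate. The coefficients are copied from the Support Group rows of Table 3; the two example areas and their input values are entirely hypothetical and are not taken from the dataset.

```python
# Coefficients copied from the Support Group rows of Table 3.
SUPPORT_CONSTANT = 44.411
SUPPORT_B_SEVERE = -1.077   # per percentage point of severe disability
SUPPORT_B_GCSE = 0.140      # per percentage point with 5 A*-C GCSEs

def predicted_support_group_pct(severe_disability_pct, gcse_pct):
    """Predicted % of WCA outcomes that are Support Group for one area."""
    return (SUPPORT_CONSTANT
            + SUPPORT_B_SEVERE * severe_disability_pct
            + SUPPORT_B_GCSE * gcse_pct)

# Two hypothetical areas that differ only in their rate of severe disability.
area_a = predicted_support_group_pct(severe_disability_pct=5.0, gcse_pct=55.0)
area_b = predicted_support_group_pct(severe_disability_pct=6.0, gcse_pct=55.0)

print(f"Area A: {area_a:.2f}%  Area B: {area_b:.2f}%")
print(f"Change for one extra point of severe disability: {area_b - area_a:.3f}")
```

Holding educational attainment fixed, each extra percentage point of severe disability lowers the predicted Support Group rate by the size of the B coefficient, which is the counterintuitive direction discussed below.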

Discussion

It is clear that there is a complex and counterintuitive relationship between WCA outcomes and variables which should logically be related and others that should not be related. If the WCA were a fair and accurate assessment of disability, the proportion of people going into the WRAG and Support Groups would be positively correlated with the rates of moderate and severe disability in the local area. This is not the case as shown in Images 1, 2 and 3.

Image 4, below, further illustrates the relationship between economic deprivation and WCA outcomes. It is apparent that fit for work judgements occur more frequently in areas with a higher proportion of their population living in the 20% poorest LSOAs of England, while the inverse is true of areas with fewer fit for work judgements than average.

Image 4. Proportion of people in poverty in areas with above or below average proportions of fit for work judgements.

Fit For Work Judgements

That fitness for work judgements increase when the rate of severe disabilities in an area increases is a strong indictment of the WCA. If the test were valid, it would be expected that this relationship would be reversed. It is logical that, in areas with higher rates of disability, there would be fewer people found able to work. That this is not the case raises concerns about the validity of the WCA.

Furthermore, this decision is influenced by the affluence of the claimant’s area. It would be expected that ESA claimants are poorer than the general population because of the means-testing component but also because disabled people are generally poorer. However, this should have no effect on the outcome of the WCA; all these people are recognised as disabled and pass the financial or contributory criteria required to even undergo the WCA. It is a great concern that people in poorer areas – where social security benefits contribute more to the local economy – are facing more stringent or less fair WCAs than people in richer areas.

The ESA Groups

Once a claimant has had their WCA and hasn't been found fit for work, they are placed into either the WRAG or Support Group. What is most concerning here is the pattern of significant findings. It would be expected that the WRAG, for claimants “capable of work-related activity” but incapable of work, would be correlated with the proportion of moderate disabilities, and it is. However, the Support Group, for claimants “incapable of work-related activity”, should be positively related to the proportion of severe disabilities, but is not. Instead, areas with higher rates of severe disability have fewer Support Group judgements.

Furthermore, educational attainment has a significant effect on both WRAG and Support Group judgements. Higher educational attainment in a local area correlates with more Support Group judgements, but the opposite is true of the WRAG. This suggests that the WCA outcome is somewhat influenced by the claimant’s experience or skill at navigating the bureaucratic structures of the Department for Work and Pensions, such as fully understanding all the questions of the ESA50 medical questionnaire. This is a variable that should have no effect on an objective and valid assessment of disability, and that it does is a concern, particularly considering how many disabilities start in childhood and the direct and indirect effects on education.

Limitations

As with all investigations, this one has limitations. Firstly, the data I have used is local averages, and this limits the conclusions that can be drawn. Were a similar investigation completed with access to full details of ESA claimants and their WCAs (points, claimants' ages and education, etc.) it would be easier to see particular biases. Local Authority averages were all that were publicly available.

Secondly, this study has looked at some nebulous concepts (e.g. “health”). The use of surrogate variables is common, but is something to bear in mind when interpreting the data. For example, a drug to prevent heart attack might be tested on its ability to lower blood pressure, and from that it would be inferred it could prevent heart attacks. The rate of heart attacks is inferred from a third variable. This is what I have done with, for example, education. I am inferring the level of education in an area by the rate of high GCSE performances.

Conclusions

The Work Capability Assessment is simply not a valid assessment of disability and is unduly affected by variables which should, in theory, have no impact on the results. This is a source of great concern for disabled people. However, improvements to the method by which successful claimants are assigned to the WRAG or Support Group have become more important since the Budget, when it was announced that new claimants placed in the WRAG would receive 30% less money, further impoverishing this already disadvantaged group.

As for the WCA itself; there are three key points raised in this analysis:

1.       The poorest are the least likely to get onto ESA at all, as areas with higher socio-economic deprivation have a higher rate of fit-for-work judgements.

2.       The assignment of claimants to the ESA groups has no relationship to the severity of the disability the claimants experience, as severe disabilities reduce the probability of a Support Group finding but also increase the probability of a fit-for-work judgement.

3.       Educational attainment should play no role in the WCA, but higher educational attainment in a local area increases the number of Support Group judgements, which are preferred by Claimants for their lower conditionality and larger payments.

I believe I have shown that the Work Capability Assessment is biased, particularly against the poorest claimants who are more likely to ‘fall at the first hurdle’ and be found fit for work. However, the WCA also shows bias against the most severely disabled who it should be helping the most. That it can have such a relationship with educational attainment is alarming, and again this suggests a further bias towards those with better educations.

As others have said, the WCA needs an urgent overhaul or to be scrapped in its entirety. This analysis shows why: by favouring richer, healthier areas with better access to education, the WCA is reinforcing the systemic disadvantages faced by disabled people.

Extras

It's hard to get across how important this is using just numbers; a lot of people don't quite grasp that, behind each number, there's a person. That there's a correlation of r=.55 between two variables, with a p value of less than 0.000000000001 (literally!) means a lot to me, but not much to others, so here I'll be adding a few more graphs and such as I add new variables or come up with ideas that might help illustrate what's going on above.

My data is available here, and is colour-coded to show sources.

Here are two graphs; one shows the rate of early deaths in areas that most or least frequently give fit for work judgements.


The second shows the proportion of people living in either the 20% poorest or richest areas depending on whether that area gives amongst the least or most Fit for Work and Support Group judgements: In areas where there's the highest rate of fit-for-work judgements, over 35% of people are in the poorest areas, vs only 5% in areas with the lowest rate.

Deprivation

I've added a new variable to make the relationships with socioeconomic deprivation a bit easier to understand; in the future, I'll be talking about the proportion of local households with two or more indices of deprivation (such as low income, low access to good healthcare, and so on).

Targets

@BendyGirl, over on Twitter, raised the possibility that the relationships could be accounted for by targets used to control how many judgements WCA assessors can make. The short answer is: I'm not sure! It's not really possible to answer with the data available, but some of it is consistent with the idea of targets so they may well be a source of the bias I've found.

This graph, using the new definition of deprivation, is a good example.


What we see is WRAG judgements staying approximately the same, while Support Group judgements go down as poverty increases and fit for work judgements go up. This is consistent with the 'targets' hypothesis.

Assume an HCP is only allowed to make 1 Support Group judgement out of every 10 they make. Now imagine they're in an area where there are going to be a lot of claimants. They have no idea which conditions are going to come through the door next; they're pressured by the system to save that one Support Group allocation for someone who "really needs it".

They're therefore more inclined to place claimants into the WRAG regardless of how severe their disability is, in case someone worse comes next, but there's a target for the WRAG too! The WRAG fills up with people that should be in the Support Group, and there's nowhere for people who should be in the WRAG to go; they're found Fit for Work.

This scenario is consistent with the graph above, and with the fact that the proportion of Support Group judgements decreases (and FFW increases) as the number of WCAs increases. That relationship is statistically significant, but it is weak and has a poor fit (Support Group r2=.032). If targets were having a strong role in the bias I've found, I would expect the r2 to be much higher. This is an example of a statistically significant result but one of dubious real-world relevance.



However, it's important to note that this data only includes the 'highest-level' decision. If the claimant went to appeal, then the tribunal's decision is the one recorded. Likewise, if the claimant hasn't had a formal decision by the DWP, then the HCP's decision is the one recorded. It's possible that targets are playing a minimal role, as tribunals are (to my knowledge) not subject to them; any bias introduced by targets at the HCP and DWP level would be undone by the tribunal decisions, assuming the majority of people affected appeal the decision (as these were IB claimants being migrated to ESA, I assume most would appeal!).

It's also worth noting that the correlation between the proportion of Support Group judgements and the number of WCAs done is non-existent when controlling for deprivation (r=.047, p=.402) - if targets were having a large effect, I'd expect there to be a correlation; more WCAs done mean the assessors are more likely to reach the targets and have to start putting people into the WRAG instead of Support Group, so the proportion in the Support Group would go down. This isn't the case, as the relationship between the number of WCAs done and the proportion of judgements can be explained by local poverty - areas with more poverty do more WCAs as they have more ESA claimants.
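For anyone unfamiliar with "controlling for" a variable, here is a minimal sketch of the standard first-order partial correlation formula, r_xy.z = (r_xy - r_xz * r_yz) / sqrt((1 - r_xz^2) * (1 - r_yz^2)). It is an illustration rather than the SPSS procedure: the function names are made up, the input correlations in the first example are hypothetical, and the p value uses a normal approximation to the t distribution, which is only reasonable because N is large.

```python
import math

def partial_correlation(r_xy, r_xz, r_yz):
    """Correlation between x and y after removing the linear effect of z."""
    return (r_xy - r_xz * r_yz) / math.sqrt((1.0 - r_xz ** 2) * (1.0 - r_yz ** 2))

def approx_two_tailed_p(r, n, controls=0):
    """Approximate two-tailed p value for a (partial) correlation.

    Uses a normal approximation to the t distribution (an assumption
    that is only sensible for large samples)."""
    df = n - 2 - controls
    t = r * math.sqrt(df / (1.0 - r * r))
    return math.erfc(abs(t) / math.sqrt(2.0))

# Hypothetical case: x and y correlate at .25, but both correlate at .5 with z.
# The x-y relationship is then fully accounted for by z.
explained_away = partial_correlation(r_xy=0.25, r_xz=0.5, r_yz=0.5)
print(f"partial r when z explains everything: {explained_away:.3f}")

# The partial correlation reported above: r = .047 with N = 324 and one control.
p_reported = approx_two_tailed_p(0.047, 324, controls=1)
print(f"approximate p for partial r = .047: {p_reported:.3f}")
```

The approximate p value should come out close to the reported p=.402, with any small gap down to the normal approximation.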

The relationship between poverty and WCA outcome isn't accounted for by the total number of WCAs done, however. Together, this is evidence against the hypothesis that the observed bias is caused by targets or "norms".


1 comment:

  1. Thanks a lot for this! What are your thoughts about the role of the assessor in this process? Much is made about the computer assessment program, but how important is the assessor's role (i.e. the doctor's role) in providing the kind of crucial and often more nuanced information needed for the powers that be to make a more accurate decision? Can doctors bear more weight in the process by accounting for more discrete observations in anecdotal comments?
