Numbers increasingly govern public services. Hospitals monitor waiting times, bed capacity and patient satisfaction – to name just a few. Universities are ranked by student satisfaction and prisons account for inmates’ time out of their cells. This growth in performance measurement and how it shapes public policy is the focus of the Quantification, Administrative Capacity and Democracy (QUAD) project by five academic teams across four countries, led by Andrea Mennicken (LSE) at CARR.
“The starting point of this research project was the observation that the role of performance measurement in the public services has really taken off, not just in the UK but also more widely across Europe, so that public management is often done with a calculator to hand”, explains Dr Mennicken.
“We wanted to trace the rise of quantification and how different instruments of quantification and performance measurement have travelled across different public sectors, but also across countries.”
The multidisciplinary research teams looked at three public service sectors – healthcare (hospitals), higher education (universities) and criminal justice (prisons) – in France, Germany, the Netherlands, and the UK (focusing on England), examining how the rise in data and performance measurement has developed and changed in these sectors and countries over time.
Though a reliance on governance by numbers is not new, the past three decades have seen a shift in the scale and purpose of quantitative data, affecting how public administration is organised, and often determining the assessment and rationing of services. This growth in quantification since the 1980s occurred alongside public sector reforms that focused on introducing private-sector style competition into the public sector (“New Public Management”).
England’s public services and the growth in performance measurement
“Often it is said that the core of the New Public Management reforms lies in the UK. From there, it travels to other European countries, and we wanted to compare and contrast to what extent we find similarities, and where we find differences.”
As a first step, therefore, Mennicken and colleagues created a database tracking the different performance indicators required for hospitals, prisons, and universities in England (and Wales, where applicable). Starting in 1985, they took a snapshot of measurements collected at 10-year intervals up to 2015.
They found that, firstly, in all three sectors the number of required performance measurement indicators increased substantially, with more and more measures added, often in response to a particular crisis (a hospital failure or a prison riot, for example). However, when new indicators were created, old indicators frequently weren’t deleted, so information is often still collected even if it is no longer used for performance evaluation. Over time, this creates a huge increase in the amount of data gathered, which poses challenges for storing, processing and acting on such data, and can contribute to an unwieldy bureaucracy, given the increasing number of new information systems that need to be managed and fed.
The nature of this data and its purpose have also shifted. “What we see in all three sectors is a move from an initial focus on costs in the 1980s, and financial measurements, to a growth in measurement of performance outputs,” explains Mennicken. This includes student satisfaction and prisoner quality-of-life measures. This then shifts again to outcome measures, such as student employment and re-offending rates.
This development and expansion of performance measurement was aimed at increasing transparency and accountability, underpinned by a moralising incentive for improvement through quantification (though this was later undercut by austerity). For example, in the 1970s and ’80s, very little information was available about life inside individual prisons, and so the introduction of performance measures was motivated in part by a desire to understand prison experiences better. The first key performance indicators – such as time spent out of cell and whether prisoners completed educational programmes – though “flawed and very rough and ready”, explains Mennicken, at least provided some information about what went on within a prison. “To just get more insight and data is quite important to empower prisoners, on the one hand, but also, of course, to inform policy-making and enhance accountability.”
However, these systems of measurement necessitate drawing boundaries around the units of evaluation. Data will be collected and analysed for each hospital or health trust or prison, for example, to aid comparison.
This creates a particular bias. “In terms of performance measurement design, the issue is, if you have an organisational entity focus to stimulate performance and competition amongst prisons, for example, or hospitals, that can undermine delivery at a systems level, if cooperation and collaboration is not accounted for in the measurements.”
It is difficult to attribute re-offending rates to individual prison performance, for example, as prisoners are regularly transferred between prisons, crossing prison boundaries and complicating the accounting for them. Rating individual prison performance – rather than the service as a whole, and without covering the interaction with the probation service – can therefore create distortions.
Comparing across countries: prisons in Germany and England and Wales
Expanding the focus beyond the UK, the QUAD project also examined the rise and spread of quantification in comparative perspective. Comparing, for instance, the prison system of England and Wales with Germany’s (Iloga Balep, Guter-Sandu, Mennicken & Huber, forthcoming), the project teams found significant differences not only in the data collected and units of measurement, but also in their purpose and underlying rationale. Data for England’s prisons are aggregated and standardised by the national prison and probation service to create a performance measurement framework that seeks to make individual prison performance – public and private – comparable.
In Germany, there is state-level autonomy in data collection, with most measures focused on state-level financial and operational planning (state-level in this case refers to the 16 federated states of Germany). Though the researchers identified a move towards collecting individual prison performance data in recent years, it is a much more fragmented picture and less information is publicly available. Primarily, performance measures tend to be used in criminological research rather than for the purposes of central steering and management.
By comparison, the researchers note that, “given that performance measurement is so pervasive in the Prison Service of England and Wales, it comes as no surprise that it is one of the main tools for political control and accountability. Prisons that perform poorly can be easily identified through the prison ratings and be required to provide explanations and routes to improvement. Also ministers have an incentive to keep an eye on the prison ratings, as these can become the source of media attention.”
This system of standardised ratings, which becomes integral to political oversight, means there is increased competition and there are strong incentives for improvement by prisons, but also unintended consequences such as “gaming, disengagement, and bureaucratisation” – whereby individual prison authorities focus on measures to improve their ranking, rather than substantial or system-wide analysis and reform.
For policy-makers, one initial implication of QUAD’s research is to be mindful, when designing measurement systems, of the accounting entity boundaries they erect, which can create unintended consequences in how policy is implemented to meet certain targets.
The other insight from this comparative research is that, even though measurement is often introduced to enhance accountability, it can undermine learning and reflexivity: actors have an incentive to game the measurement system and to shift blame. As Mennicken notes:
“What we often overlook is that performance measurement can be useful, not just for the purpose of control, but also for the purpose of learning and information gathering, which is not as directed in terms of behaviour modification. So the de-linking of accountability from measurement may also be useful.”