Background
Introduced in 2004, the United Kingdom’s Quality and Outcomes Framework (QOF) is the world’s largest primary-care pay-for-performance programme. The QOF links up to 25% of general practitioners’ income to performance on a wide range of quality indicators related to clinical management of common chronic conditions, organisation of care and patient experience [1]. This supplements existing payments to practices, which are largely provided through capitation payments. Research on the QOF suggests that the programme accelerated improvement for the incentivised indicators in the 3 years following its implementation [2]. However, this improvement appeared to attenuate over time [3-5]. A recent analysis also found that the QOF did not significantly improve mortality for disease areas targeted under the programme [6].
The QOF is subject to annual review, with changes agreed in negotiations between National Health Service (NHS) Employers and the British Medical Association’s General Practitioners Committee, informed by indicator development work conducted by the National Institute for Health and Care Excellence. In 2014/15, 40 indicators—accounting for 35% of the value of total incentive payments—were removed from the scheme without replacement, with most of the associated resource used to increase capitation payments [7]. In 2016/17, the QOF was discontinued altogether in Scotland and its funding was transferred to capitation payments [8]. The QOF continues in England, Wales and Northern Ireland, although options for reform or replacement are being considered.
Despite the large costs of the QOF and other pay-for-performance programmes, almost no evidence exists on their cost-effectiveness and how this compares to other system-level interventions to improve longevity [5, 9, 10]. Pay-for-performance programmes introduce additional economic costs to the health-care system, which could have been spent on other health interventions or policies. A cost-effectiveness analysis, the standard method for assessing value for money, can be used to determine whether additional spending on pay-for-performance is worth the health gains produced by these policies. Although a cost-effectiveness analysis has previously been conducted for other pay-for-performance programmes in the UK, decisions on the development or discontinuation of the QOF have not been informed by reliable estimates of cost-effectiveness [11]. A systematic review of pay-for-performance programmes found that, despite the promise of cost-effective financial incentives, convincing evidence of the cost-effectiveness of pay-for-performance was lacking [12]. Previous attempts to estimate the cost-effectiveness of pay-for-performance have extrapolated evidence from randomised trials, rather than using direct evidence of its effectiveness on outcomes [4, 9, 13]. These approaches are limited because results from randomised trials may not generalise to the older, sicker patients who are typically excluded from trials [6].
In this study, we address this knowledge gap by evaluating the cost-effectiveness of the QOF under various assumptions about the programme’s benefits and costs.
Discussion
In this study we modelled the cost-effectiveness of continuing the QOF in the UK to evaluate whether the incremental health gains from continuing the pay-for-performance programme would be worth the additional costs of doing so. We found that the incremental cost-effectiveness ratio (ICER) for continuing the QOF was £49,362/QALY, with an 18% probability of being cost-effective in probabilistic sensitivity analysis at a cost-effectiveness threshold of £30,000/QALY. This estimate was robust to variation in assumptions related to non-fatal outcomes, increased drug costs and waning of the QOF’s benefits. ICERs for the QOF were substantially more favourable only in those scenarios where the QOF was associated with large reductions (beyond our base-case estimates) in (1) costs associated with averted health events or (2) non-fatal cardiovascular disease events. The estimated population opportunity cost of continuing the QOF (in terms of incremental net health benefit) was 226,109-979,917 QALYs lost.
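The ICER and incremental net health benefit used above follow their standard definitions. As a minimal sketch, the population-level inputs below are hypothetical round numbers; only the resulting £49,362/QALY ICER and the £30,000/QALY threshold match figures in the text:

```python
# Illustrative ICER and incremental net health benefit (INHB) calculations.
# The delta_cost and delta_qalys values are hypothetical, chosen only so the
# base-case ICER from the text is reproduced.

def icer(delta_cost, delta_qalys):
    """Incremental cost-effectiveness ratio: extra cost per extra QALY."""
    return delta_cost / delta_qalys

def inhb(delta_cost, delta_qalys, threshold):
    """Incremental net health benefit (in QALYs): QALYs gained minus the
    QALYs the health system could have bought elsewhere with delta_cost."""
    return delta_qalys - delta_cost / threshold

delta_cost = 49_362_000.0   # hypothetical extra spending on the QOF (£)
delta_qalys = 1_000.0       # hypothetical extra QALYs gained

print(f"ICER: £{icer(delta_cost, delta_qalys):,.0f}/QALY")        # £49,362/QALY
print(f"INHB: {inhb(delta_cost, delta_qalys, 30_000):.1f} QALYs")  # -645.4 QALYs
```

A negative INHB at the £30,000/QALY threshold is the per-scenario analogue of the population opportunity cost quoted above: the QALYs gained are fewer than the QALYs forgone by diverting the spending.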
Our estimate of the ICER for the QOF is above the conventional threshold of £20,000-30,000/QALY used to determine cost-effectiveness in the UK. This suggests that primary-care pay-for-performance in the United Kingdom has not been a cost-effective strategy to improve health. Nonetheless, our base-case analysis treats QOF incentive costs as incremental to the health system: we assumed that stopping the QOF would return all incentive payments to the NHS. However, if the NHS stopped the QOF and returned all or some of the QOF payments to providers as increased capitation payments, as has already happened in Scotland, this would maintain the costs of the QOF while losing its benefits (unless the benefits of the QOF persisted or waned only gradually after the financial incentives were stopped). Relative to this scenario, continuing the QOF would be more favourable. In sensitivity analyses, we found that continuing the QOF would be cost-effective at a threshold of £30,000/QALY if QOF incentive payments (see Fig. 2) were 32% lower than our base-case estimate, and at a threshold of £13,000/QALY if they were 64% lower, assuming that the mortality impact of the QOF remained unchanged. Our analysis gives policymakers cost-effectiveness information for all joint scenarios of QOF payment and mortality benefit estimates, facilitating decisions about lower QOF payments in light of the potential reductions in QOF benefits that would follow. Our results were not sensitive to the assumption that the QOF’s benefits are concentrated in those with cardiovascular disease: the ICER remained above £30,000/QALY even when we assumed the QOF equally benefited all individuals aged 0-74 years, with and without disease.
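The payment-reduction break-even points follow from simple arithmetic once the incremental cost is split into incentive payments and other net costs. A minimal sketch, assuming a purely hypothetical per-QALY decomposition (the £60,000 incentive figure and the negative other-cost figure are illustrative values chosen to reproduce the base-case ICER, not the model’s actual inputs):

```python
# One-way sensitivity sketch: by what fraction must QOF incentive payments
# fall before the ICER drops below a given threshold? The decomposition of
# incremental cost into incentive payments plus other net costs is a
# hypothetical assumption for illustration only.

def required_payment_reduction(incentive_cost, other_net_cost,
                               delta_qalys, threshold):
    """Smallest fraction x in [0, 1] such that
    (other_net_cost + (1 - x) * incentive_cost) / delta_qalys <= threshold."""
    x = (other_net_cost + incentive_cost - threshold * delta_qalys) / incentive_cost
    return min(max(x, 0.0), 1.0)

# Hypothetical per-QALY split: £60,000 of incentive payments and -£10,638 of
# other net costs (e.g. averted utilisation outweighing extra drug costs),
# chosen so the base case matches the £49,362/QALY ICER.
print(round(required_payment_reduction(60_000, -10_638, 1.0, 30_000), 4))  # 0.3227, i.e. ~32%
```

The same function, evaluated at a £13,000/QALY threshold, gives the corresponding (larger) reduction; under this particular hypothetical decomposition the two break-even points will not match the paper’s pair exactly, since the true decomposition is not reported here.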
In general, cost-effectiveness analyses of pay-for-performance policies, especially those that use QALYs as the effectiveness measure (i.e. cost–utility analyses), are rare. In a systematic review of economic evaluations of pay-for-performance policies, Emmert and co-authors found only one cost–utility analysis [10]. That study, by Nahra and co-authors, modelled how hospital process improvements in heart-related care could be used to estimate QALYs via improved medication compliance and found that the pay-for-performance policy was cost-effective [13]. Since the publication of the review by Emmert et al., Meacock et al. and Walker et al. have performed cost–utility analyses of pay-for-performance policies. Meacock and colleagues evaluated the cost-effectiveness of a pay-for-performance scheme for hospitals in the UK (the Advancing Quality programme) using reductions in 30-day mortality (among patients admitted for pneumonia, heart failure or acute myocardial infarction) estimated from a difference-in-difference study and found that the programme was cost-effective using a threshold of £20,000/QALY [11]. Walker and co-authors used previously published literature to estimate the potential cost-effectiveness of the QOF. They found that, for most QOF indicators studied, a less than 1% improvement would be needed for the programme to be cost-effective (using the £20,000-30,000/QALY threshold range). However, this study used estimates from randomised controlled trials to extrapolate the hypothetical effects of incentivising individual activities for which evidence on effectiveness was available, rather than estimating the impact of the overall programme [9].
Our findings conflict with those of previous attempts to estimate the cost-effectiveness of pay-for-performance policies. Only the Meacock et al. study used effectiveness estimates (a reduction in mortality that was translated into QALYs gained) that were directly measured, as opposed to extrapolating intermediate outcomes (such as improvements in medication compliance) to QALYs. The Meacock cost-effectiveness study was based on a difference-in-difference analysis that found no significant impact of a hospital-based pay-for-performance programme on outcomes for two of the incentivised conditions (acute myocardial infarction and heart failure) and a modest improvement for the third (pneumonia) after 18 months [36]. However, this improvement was not sustained, and re-analysis of the original data using a synthetic control approach found that the initial improvement for pneumonia was not statistically significant [37, 38]. Unlike the Meacock et al. study, we were not able to directly measure, and thus include, the administrative costs of running the pay-for-performance programme. Our study is the first to evaluate the cost-effectiveness of the QOF using direct estimates of mortality (as opposed to intermediate outcomes), and the first such evaluation of a primary-care, rather than hospital-based, pay-for-performance policy.
Our study has several limitations, the first four of which relate to data availability. First, we could not estimate administrative costs. If administrative cost data become available, these costs could be added to the annual incentive costs in our sensitivity analysis around costs (the horizontal axis in Fig. 2) to estimate their impact on the cost-effectiveness results; their addition would increase the ICER for continuing the QOF. Second, although our model cohort was simulated until death, we restricted QOF mortality effects to ages up to 74 years, given the source data [6]. Third, we had limited information on the effect of the QOF on costs, such as those of additional visits to practices, referrals to secondary care and medication prescriptions. For instance, very few studies estimate the impact of the QOF on health-care utilisation, which is why we relied on a 2008 observational study from Scotland to estimate incentivised drug costs, despite our model assuming a causal relationship between the QOF incentives and increased utilisation. To address this, we performed a sensitivity analysis around this input value, which showed that cost savings because of the QOF (i.e. averted utilisation costs outweighing the incentive costs) would be necessary to make the QOF cost-effective using a threshold of £30,000/QALY. Fourth, we had incomplete information about the effects of the QOF on non-fatal outcomes, such as acute myocardial infarction and stroke. To address this, we varied our estimates across a range of assumptions about the effects of the QOF on non-fatal outcomes and found that the ratio of fatal to non-fatal events would need to more than double from our base-case estimate to make the QOF cost-effective. Fifth, our probabilistic sensitivity analysis showed considerable statistical uncertainty around the cost-effectiveness of continuing the QOF, suggesting that more precise estimates of the effect of the QOF on mortality could reduce the uncertainty around the decision of whether to continue it. We varied only one parameter (the effectiveness of the QOF) in the probabilistic sensitivity analysis because it was the only input with a well-estimated 95% confidence interval; adding other parameters could produce greater uncertainty in our cost-effectiveness results, but would not change the overall conclusion of our analysis. Sixth, the QOF is subject to annual review and amendment [6-8, 24, 26], and our main estimates for effectiveness and incentive costs were based on the first 7 years of the programme. Our analysis therefore evaluates the decision to continue the incentives contained in the QOF from 2004 to 2010 versus discontinuing them. We have not evaluated the most recent versions of the QOF, for which we do not have linked cost and effectiveness data, although incentives for the conditions of interest were retained in these versions.
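The one-parameter probabilistic sensitivity analysis described above can be sketched as follows. This is a hedged illustration, not a re-implementation: the incremental cost, baseline QALY gain and the 95% confidence interval for the effect multiplier are all hypothetical placeholders.

```python
# Sketch of a one-parameter probabilistic sensitivity analysis: sample the
# QOF effect multiplier from a normal distribution implied by a hypothetical
# 95% CI, recompute the ICER per draw, and report the share of draws that
# fall below the £30,000/QALY threshold. All inputs are illustrative.
import random

random.seed(42)

THRESHOLD = 30_000          # £/QALY
DELTA_COST = 49_362_000.0   # hypothetical incremental cost of the QOF (£)
BASE_QALYS = 1_000.0        # hypothetical QALY gain at the mean effect

mean_effect, ci_lo, ci_hi = 1.0, 0.2, 1.8   # hypothetical 95% CI
sd = (ci_hi - ci_lo) / (2 * 1.96)           # recover the SD from the CI width

n_draws = 10_000
cost_effective = 0
for _ in range(n_draws):
    qalys = BASE_QALYS * random.gauss(mean_effect, sd)
    if qalys > 0 and DELTA_COST / qalys <= THRESHOLD:
        cost_effective += 1

print(f"P(cost-effective at £{THRESHOLD:,}/QALY) = {cost_effective / n_draws:.2f}")
```

With these placeholder inputs the probability comes out low, mirroring the reported 18% only qualitatively; the point of the sketch is the mechanism (a single sampled parameter driving the whole distribution of ICERs), not the specific value.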
Despite these limitations, our findings imply that the UK should redesign the QOF or pursue alternative interventions to improve population health efficiently. The QOF is already in transition. The programme was reduced in scope in 2014, with 40 indicators retired to focus on a set of 83 key indicators [7, 39]. In Scotland, the QOF was withdrawn altogether in 2016, with practices continuing to receive payments based on their historical performance without any further need to meet QOF targets. In the future, quality improvement in Scottish practices will be managed by local peer support networks and will rely on clinical governance arrangements rather than financial incentives [40]. NHS England is also now seeking to develop a successor to the QOF [41]. To enable informed decisions on further redesign or replacement of the QOF, future research should compare the cost-effectiveness of the programme with alternative system-level interventions (whether these programmes are different forms of financial incentives for providers or patients, or use other mechanisms to improve quality, costs or access to care). Similar research should be undertaken in other settings where pay-for-performance has been implemented, comparing its cost-effectiveness with other health system-level policies such as value-based insurance design [42], computerised decision support interventions [43] or value-based outcome reporting tools [44]. These future analyses would provide crucial information about whether pay-for-performance in primary care is a cost-effective way to improve population health.