nach oben

BMC Health Services Research

Erschienen in:

Open Access 01.12.2022 | Research

The contribution of benchmarking to quality improvement in healthcare. A systematic literature review

verfasst von: Claire Willmington, Paolo Belardi, Anna Maria Murante, Milena Vainieri

Erschienen in: BMC Health Services Research | Ausgabe 1/2022

Abstract

Background

Benchmarking has been recognised as a valuable method to help identify strengths and weaknesses at all levels of the healthcare system. Despite a growing interest in the practice and study of benchmarking, its contribution to quality of care have not been well elucidated. As such, we conducted a systematic literature review with the aim of synthesizing the evidence regarding the relationship between benchmarking and quality improvement. We also sought to provide evidence on the associated strategies that can be used to further stimulate quality improvement.

Methods

We searched three databases (PubMed, Web of Science and Scopus) for articles studying the impact of benchmarking on quality of care (processes and outcomes). Following assessment of the articles for inclusion, we conducted data analysis, quality assessment and critical synthesis according to the PRISMA guidelines for systematic literature review.

Results

A total of 17 articles were identified. All studies reported a positive association between the use of benchmarking and quality improvement in terms of processes (N = 10), outcomes (N = 13) or both (N = 7). In the majority of studies (N = 12), at least one intervention, complementary to benchmarking, was undertaken to stimulate quality improvement. The interventions ranged from meetings between participants to quality improvement plans and financial incentives. A combination of multiple interventions was present in over half of the studies (N = 10).

Conclusions

The results generated from this review suggest that the practice of benchmarking in healthcare is a growing field, and more research is needed to better understand its effects on quality improvement. Furthermore, our findings indicate that benchmarking may stimulate quality improvement, and that interventions, complementary to benchmarking, seem to reinforce this improvement. Although this study points towards the benefit of combining performance measurement with interventions in terms of quality, future research should further analyse the impact of these interventions individually.

Additional file 1.

Additional file 2.

Additional file 3.

Additional file 4.

Supplementary Information

The online version contains supplementary material available at https://doi.org/10.1186/s12913-022-07467-8.

Claire Willmington and Paolo Belardi contributed equally to this work.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Background

Introduced in the late 70s as an effort to reduce production costs in the manufacturing sector, benchmarking has since then been used as a method for continuous quality improvement in many different sectors and fields [1]. Although international literature has provided several definitions and taxonomies of benchmarking [2‐6], all of them share a common theme, defined as a “continuous process of measuring products, services and practices against the toughest competitors or those companies recognized as industry leaders” [2].

Starting from the 1990s, benchmarking has been applied to the healthcare sector with the aim of measuring and comparing clinical outcomes across organizations as well as enabling them to learn from one another and apply best practices [1, 7]. Benchmarking has become a structured method in the United States and the United Kingdom with the end goal of comparing hospital outcomes for cost-containment purposes [8], although comparison of outcome indicators dates back to the seventeenth century. The increased use of benchmarking was influenced by different factors, including the need to identify and better understand differences in healthcare practices and outcomes between and within different geographical areas [9]. If properly used, benchmarking may also provide a mechanism to detect unwarranted variation and promote the reduction of such [10, 11].

Nowadays, benchmarking represents one of the strategies used for quality improvement, that is, «the changes that will lead to better patient outcomes (health), better system performance (care) and better professional development» [12]. When benchmarking is used to this end, it includes a series of steps such as: identification of best performers through data analysis as well as in-depth (qualitative) investigation of factors that support the observed performance and quality improvement. Performance indicators allow for the conversion of quality to quantifiable metrics that can provide simplified information about a larger area of interest and facilitate comparison across organizations [13, 14]. Depending on the context, the indicators reporting benchmarking data can be aimed at different users with varying decision-making capabilities, ranging from patients to clinicians and policy makers [1, 15]. For instance, comparative performance data of certain clinical processes may lead clinicians to engage in different quality improvement activities such as audit & feedback strategies as well as professional development programs, whereas governments and regional authorities may choose to set policies based on the reporting of certain outcomes [15‐17]. Thus, it is crucial that performance indicators convey the right type of information to the right stakeholders. Another key element that contributes to the success of benchmarking is the development of reliable and valid performance indicators that are fit for use [13, 17]. This, however, remains a challenge, especially when it comes to cross-national comparisons as countries may differ in coding and methodologies they use to calculate indicators [14, 18]. Additionally, collaboration between benchmarking participants has also been shown to be a key factor contributing to the successful implementation and use of benchmarking in the healthcare sector [19, 20].

A number of reviews provided evidence that combining benchmarking with public reporting had a limited to moderate effect on quality improvement [21, 22]. However, public comparisons of performance of individuals or organizations could lead to controversy as poorer performers may be discouraged to improve if they feel their reputation has been damaged (e.g. “naming and shaming”) [23‐25]. On the other hand, public reporting of performance can also be used to stimulate quality improvement if top performance is emphasized (e.g. “naming and faming”) [26].

What emerges from the existing literature is that there is a continuous and growing interest in the systematic assessment and practice of benchmarking undertaken by healthcare systems and international agencies [13, 27‐29]. However, the contribution of benchmarking to quality of care has not been studied extensively.

To investigate this further, we conducted a systematic literature review with the aim of answering to the following research questions:

RQ1: Is there a relationship between the use of benchmarking and quality improvement in healthcare?

RQ2: Can benchmarking combined with additional strategies (e.g. meetings among participants, audit and feedback, use of incentives) further stimulate quality improvement?

Methods

A systematic literature review was conducted according to the Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines [30].

Search strategy

To identify articles, we searched the following three databases, PubMed, Web of Science and Scopus. Search terms and keywords were defined according to the current literature on benchmarking. We reported in Additional file 1 the search strategies used for each of the databases along with the number of studies found.

The three databases were searched in January 2021, from their inception date to December 2020. The screening of articles followed a two-step process including: i) screening of titles and abstracts and ii) full text reading. Additionally, the reference lists of relevant articles were scanned to overcome the lack of database search generated articles containing the defined keywords in their title or abstract text.

A quality appraisal of the eligible articles was performed using the quality assessment tool (QATSDD) developed by Sirriyeh et al. for reviewing studies with diverse designs [31] (see Additional file 4). Additionally, we summarized the methodological strengths and weaknesses in the results section.

Study selection

Our search was restricted to peer-reviewed articles published in the English language. Inclusion and exclusion criteria were defined a priori. Articles were considered eligible if they empirically assessed the relationship between benchmarking and clinical outcomes as well as processes across at least two entities over time. We considered healthcare entities at all scales of benchmarking analysis: international, national, and regional level.

While we excluded articles that only focused on the direct impact of public reporting on performance, we considered articles in which benchmarking results were publicly available. Furthermore, we included articles in which the benchmarking participants were the sole decision-makers and users of the benchmarking results. As such, we excluded articles where the decision-making was external to the benchmarking participants, as it is the case for value-based programs in the US or consumers making informed choices. Additionally, we excluded studies that estimated the potential effects of benchmarking on quality through prediction models and those in which the relationship between benchmarking and performance was considered too indirect. We also excluded articles which did not assess performance over time. Finally, we excluded conceptual and theoretical articles as well as review articles, although we did not apply a filter concerning the study design (qualitative versus quantitative) or methodological approach as mixed-methods bring valuable contribution to this research field.

Two reviewers (PB and CW) independently screened titles and abstracts for relevance (see step I in search strategy subsection). Once potentially eligible articles were identified, all four authors independently screened full-text articles for inclusion. Any disagreement between reviewers was resolved through internal discussion and until consensus was reached. Additionally, it is worth noting that the heterogeneity of the studies in terms of methodology, clinical areas and study design was taken into consideration during the undertaking of this systematic literature review.

Data extraction and analysis

Using a data-charting tool (see Additional file 2 for the list of the variables included), we extracted the following information from the articles: authors; title; year; reported impact of benchmarking; type of quality improvement activity; country; data related to the benchmarking initiative (scale, participation, development, communication and indicators); study design; research question and findings. The data-charting tool was designed collectively as well as piloted by all four investigators (CW, PB, AMM, MV). We performed additional searches using authors sources or institutional webpages when information concerning the benchmarking initiative was missing or not specified in the article directly.

Following Donabedian’s definition of quality [32], we classified the results by process and outcome domains. Due to the high level of heterogeneity between studies in terms of outcomes and methodological designs, we were unable to perform a meta-analysis. However, we provided a synthesis of the resulting evidence.

Results

Literature search

As shown in Fig. 1, the literature search across the three databases identified 5935 articles. An additional 12 articles, identified through scanning of the articles’ references were integrated with the articles identified during the screening of titles and abstracts. Therefore, a total of 5947 articles were identified. The removal of duplicates (N = 999) narrowed down the number of articles to 4948. After applying the inclusion and exclusion criteria, a further 4879 articles were excluded from the second round of screening, thus resulting in 69 articles eligible for assessment. Finally, the full-text screening led to the exclusion of 52 articles, reasons being that they either did not meet the inclusion criteria previously defined in the methods section (see subsection “study selection”) or their full texts were unavailable. As such, a total of 17 articles were finally considered for qualitative assessment and synthesis [33‐49].

Study characteristics and benchmarking approaches

Table 1 illustrates the characteristics of the 17 studies. These were published in academic journals between 2004 and 2020 and all benchmarking initiatives were implemented in either North America, Europe or Japan. Thus, all analysed studies took place in high-income countries, as classified by the World Bank [50].

Table 1

Description of the studies used in this paper

#	First author; Year	Clinical area	Effect on quality process	Effect on patient outcomes	Type of actions	Benchmarking participants (n)	Units analysed (n)	Benchmarking dimension	Reporting frequency	Number of indicators analysed in the paper	Communication of performance results	Study design
1	Cronenwett et al. 2007 [33]	Surgery - cardiovascular	Preoperative medication use: B-blocker increased from 72 to 91%; preoperative aspirin or clopidogrel from 73 to 83% and preoperative statin from 54 to 72%.	Not evaluated	Biannual meetings attended by different stakeholders (e.g. surgeons, data collection personnel, researchers, and hospital administrators). Participants received instruction in continuous quality improvement techniques and applied these principles to preoperative B-blocker usage.	Hospitals (n = 9)	Surgical operations (n = 6143)	Quality	Continuous	3	Internal purposes	Observational
2	Campion et al. 2011 [34]	Oncology - Palliative	Higher performance for recurring participants on 9 indicators related to the assessment of pain and dyspnea as well as hospice care.	Not specified	Not specified.	Clinics (n = 178)	Clinics (n = 178)	Quality	Continuous	15	Internal purposes	Observational
3	Stern et al. 2011 [35]	Cystic fibrosis	Not specified.	Centres improved on indicators related to patient weight and lung capacity over a three-year period.	Best centres asked to define their strategies and share them to feed a learning processes/quality improvement. Open internal discussions. Plan-do-check-act (PDCA) cycles.	Cystic fibrosis centers (n = 12)	Cystic fibrosis centers (n = 12)	Quality	Continuous	3	Public disclosure	Observational
4	Hermans et al. 2013 [36]	Diabetes	No significant change.	Higher proportion of patients in the benchmarking group reached clinical targets than in the control group over a 12-month follow-up period.	Not specified.	Primary care physicians (n = 477)	Primary care physicians (n = 477)	Quality	Not continuous	4	Not reported	RCT
5	Merle et al. 2009 [37]	Hip replacement	Indicators related to clinical processes (e.g. time between discharge from orthopedic ward and completion of orthopedic hospitalization record) improved.	Lower percentage of readmissions to acute care in all participating hospitals. Lower percentage of pts. with pressure sores in one hospital. Time to surgery improved in single hospitals.	Review/discussion of comparative performance results by the teams followed by implementation of quality improvement as deemed necessary by each team: improving nutritional status, shorten delays, improving communication btwn professionals.	Hospitals (n = 3)	Hospitals (n = 3)	Quality, appropriateness and patient safety	Not continuous	15	Internal purposes	Interventional
6	Hall et al. 2009 [38]	Surgery - general	Not specified.	Improvement of both mortality and complication rates across participating hospitals.	Best practices guidelines; case studies of hospitals improving; and rapid data feedback for monitoring progress were provided to participating hospitals.	Hospitals (n = 187)	Hospitals (n = 187)	Quality, appropriateness and patient safety	Continuous	2	Internal purposes	Observational
7	Tepas III. et al. 2014 [39]	Surgery - general	Not specified.	Reduction of postoperative complications (14.5%): lower incidences of catheter-associated urinary tract infections, surgical site infections, and adverse events after colorectal surgeries in patients over 65.	Monthly participant conference calls.	Hospitals (n = 54)	Surgical operations (n = 38,896)	Patient safety	Non continuous	4	Internal purposes	Observational
8	Nuti et al. 2016 [40]	Multiple	Performance improvement on composite indicator (encompassing hospital, primary and preventive care) in 11 out of 21 regions.	Not evaluated.	Strategic planning and goal setting of health authorities involved. P4P schemes for heads of health authorities. Communication and discussion of results among different stakeholders including managers, clinicians and patients.	Regional healthcare systems (n = 21)	Regional healthcare systems (n = 21)	Population health, regional strategy compliance, quality, patient satisfaction, staff satisfaction, efficiency	Continuous	14	Public disclosure	Observational
9	Govaert et al. 2016 [41]	Oncology_colorectal cancer	Not specified.	Severe complication rate and mortality rate declined by 20 and 29% respectively. Length of hospital stay declined by 13%.	Not specified.	Hospitals (n = 29)	Patients (n = 9913)	Quality, appropriateness and patient safety	Continuous	4	Internal purposes	Observational
10	Piccoliori et al. 2020 [42]	Primary care	Improvement on indicators related to documentation of patient charatersitics, diagnostic tests and prescription of anticoagulants.	Improvement over 1 to 2 years follow-up: Lower percentage of patients with lower blood pressure; Higher number of diabetic patients with HbA1c < 7.0%; Higher percentage of patients with lower LDL-cholesterol.	Self-audit. Technical support provided to participants. Quality circles conducted twice a year to discuss results and strategies for improvement.	General practitioners (n = 36)	General practitioners (n = 36)	Quality	Not continuous	91	Not reported	Interventional
11	Qvist et al. 2004 [43]	Multiple	Improvement on indicators related to documentation of patient charatersitics, planning of clinical pathway, medication and information provision to patients.	No significant changes.	Conference held btwn two audit rounds. Wards with highest performance gave presentationson local processes of care. Quality improvment projects.	Hospitals (n = 47)	Hospitals (n = 47)	Quality	Not continuous	10	Internal purposes	Observational
12	Nuti et al. 2013 [44]	Multiple	More than 50% of the indicators significantly improved their yearly performance over the 4-year period.	More than 50% of the indicators significantly improved their yearly performance over the 4-year period, including the percentage of femur fractures operated within 2 days.	Linkage between preformance on indicators and CEO’s reward system. Regular meetings between different stakeholders, including managers and clinicians.	Regional local health authorities (n = 12) and teaching hospitals (n = 4)	Regional local health authorities (n = 12) and teaching hospitals (n = 4)	Population health, capacity to pursue regional strategies, clinical performance, patient statisfaction, staff satisfaction, effiency	Continuous	130	Public disclosure	Observational
13	Van Leersum et al. 2013 [45]	Oncology_colorectal cancer	Increase in % of patients discussed in a pre-operative meetings. Improvement inù the implementation of recommended guidelines on preoperative MR-imaging for rectal cance. Improved standard of pathological reporting.	Postoperative morbidity, length of hospital stay and postoperative mortality decreased significantly. The re-intervention rate decreased.	Not specified.	Hospitals (n = 92)	Patients (n = 24,828)	Quality, appropriateness and patient safety	Continuous	10	Public disclosure	Observational
14	Margeirsdottir et al. 2010 [46]	Diabetes	Use of intensive insulin treatment and pumps increased.	The mean HbA1c of all clinics improved.	Quality meetings and discussions. Provision of clinical guidelines to participating teams at the start of the study.	Clinics (n = 25)	Patients (n = 5599)	Quality and appropriateness	Continuous	7	Internal purposes	Observational
15	Kodeda et al. 2015 [47]	Oncology_colorectal cancer	Preoperative radiotherapy and chemoradiotherapy became more common. Number of multidisciplinary team conferences increased. Indicators related to specific surgical procedures improved.	Postoperative mortality after 30 and 90 days decreased. 5-year local recurrence rate dropped. Proportion of non-operated patients increased.	Regional and national meetings where points and specific findings are presented and discussed by representatives from all hospitals.	All hospitals in Sweden	Patients (n = 29,925)	Quality, appropriateness and patient safety	Continuous	22	Public disclosure	Observational
16	Pinnarelli et al. 2011 [48]	Hip replacement	Not specified.	Proportion of hip operations performed within 48 h increased by 34% for Lazio and 46% for Tuscany.	Workshops for discussion and training organised among regional managers and professionals. Performance on indicators are linked with CEO’s compensation system/DRG reimbursement.	Hospitals in Lazio (n = 42) and Tuscany (n = 26)	Patients (n = 273,320)	Quality	Continuous	1	Public disclosure	Observational
17	Miyata et al. 2012 [49]	Surgery - cardiovascular	Not specified	Improvement of operative mortality and morbidity.	Not specified.	Hospitals (n = 99)	Isolated CABG procedures (n = 3882)	Quality and patient safety	Continuous	2	Internal purposes	Observational

We found that the studies included diverse clinical areas. Nevertheless, a number of studies can be grouped in similar clinical areas (see column “Clinical area” in Table 1), namely oncological care (N = 4), surgical care – general and cardiovascular (N = 5) - and chronic illeness care (N = 3).

In all but one benchmarking initiative, participation was voluntary as opposed to mandatory. Participants varied from individual clinicians to hospitals. In terms of granularity of the analyses (see column “units analysed” in Table 1), the level of data aggregation ranged from individual procedures and patients to hospitals and regional healthcare systems.

Figure 2 Panel A illustrates the distribution of the different scales at which benchmarking was carried out. Benchmarking activities were mostly conducted at a national level: either covering an entire territory or selected regions. Only one initiative was implemented at the international level.

As displayed on Fig. 2 Panel B, the benchmarking activities were developed and implemented by a wide variety of actors within the healthcare system. Most of them, however, were carried out by either academia or medical associations (N = 13, see studies number 1,2,3,5,6,7,8,9,10,12,13,14,16 in Table 1). Additionally, the majority of benchmarking initiatives (N = 11, see column “Reporting frequency in Table 1) monitored performance continuously over time.

With reference to our research objective, we found that all studies included in our analysis reported quality improvement both in terms of care process and outcomes.

Secondly, we found that the use of benchmarking was generally associated with various complementary quality improvement strategies, as illustrated in the following subsections. Finally, all the results reported evidence of a positive contribution of benchmarking, suggesting a bias in the literature.

Quality improvement in terms of processes and outcomes

Evaluation of performance on process indicators over time was conducted in over half of the studies. Almost all of these studies (N = 10) reported significant improvement on these measures. Table 1 shows that measures on medication were most commonly reported (N = 4, see studies number 1,10,11,14 in Table 1), followed by measures on documentation of patient’s health (N = 3, see studies number 5,10,11 in Table 1), diagnostic test (N = 2, see studies number 10,13 in Table 1) and multidisciplinary meetings (N = 2, see studies number 8,12 in Table 1). Medication measures included use of B-blockers, anticoagulants and insulin. Six studies did not evaluate care processes (see studies number 3,6,7,9,16,17 in Table 1). Evaluation of performance on process indicators over time was conducted in over half of the studies. Almost all of these studies (N = 10, see studies number 1,2,5,8,10,11,12,13,14,15 in Table 1) reported significant improvement on these measures.

14 studies assessed outcome measures over time. Apart from two, all of these studies reported significant improvement on outcome measures, which largely consisted of measures on mortality and post-surgery complications (N = 6, see studies number 5,6,7,9,13,15 in Table 1), followed by outcomes for diabetic patients, e.g.systolic blood pressure, cholesterol and HbA1c levels (N = 2, see studies number 4,14 in Table 1, hospital length of stay (N = 2, see studies number 9,12,13 in Table 1) and time to surgery (N = 2, see studies number 5,12 in Table 1). Four of the studies reported adjusted outcome measurements at patient level (age, risk).

Seven studies reported performance improvement on both process and outcome indicators. The study period outlined in all the articles varied from 6 months to 18 years.

Performance changes on process and outcomes indicators reported by each study are described in Table 1.

The methods used to improve quality can be classified into two categories: strategies that made direct use of results on performance indicators to actively stimulate performance improvement – audit & feedback, quality improvement plans, Plan-do-check-act (PDCA) cycles, financial incentives - and strategies that indirectly supported quality improvement such as meetings, provision of guidelines as well as technical support.

Table 1 shows that meetings among participants were the most frequently used strategy by benchmarking initiatives to support performance improvement (N = 11, see studies number 1,3,5,6,7,8,11,12,14,15,16 in Table 1), followed by quality improvement plans (N = 4, see studies number 1,3,8,11 in Table 1), pay-for performance schemes (N = 3, see studies number 8,12,16 in Table 1), provision of guidelines (N = 2, see studies number 6,14 in Table 1) and audit & feedback (N = 2, see studies number 6,10 in Table 1). A combination of at least two strategies were present in over half of the studies (N = 10, see studies number 1,3,5,6,8,10,11,12,14,16 in Table 1). This combination would most commonly include meetings or discussions and direct quality improvement plans (N = 5, see studies number 3,5,8,10,11 in Table 1). Additionally, meetings were used as a single strategy in two of the studies. Five studies, on the other hand, did not report any type of quality improvement strategy implemented (see studies number 2,4,9,13,17 in Table 1).

Methodological approaches for quality improvement measurement

To assess the change in quality linked to benchmarking, most of the studies included in this analysis considered time trends, starting from the beginning of performance reporting (see studies number 1, 3, 6–9, 11–16 in Table 2). Other studies, however, used different approaches, including comparing performance between initial participants and those that joined the benchmarking initiative later (see studies number 2, 17 in Table 2), as well as comparing performance of facilities before and after initiation of benchmarking (see studies number 5, 10 in Table 2). In one case, a control group was used to evaluate the change in performance of facilities that underwent benchmarking (see study number 4 in Table 2). While the articles varied in terms of study periods, ranging from 6 months to 18 years, performance, was on average, monitored over a period of 4 years. The longer the study period was, the more likely information bias was reduced. Seven studies were population-based (see studies number 8, 9, 12, 13, 14, 15, 16 in Table 2), which reduced selection bias in these cases. In certain studies, data was aggregated at the healthcare provider or regional level (see studies number 8, 11,12 in Table 2). Methods for counteracting selection bias and accounting for differences between patients as well as care settings were specified in almost all articles. In certain smaller-scale studies, data analysis was performed and reported for each facility involved, thus also accounting for potential differences between care settings (see studies number 3, 5 in Table 2). In cases where no form of risk-adjustment was performed, the analysis was often focused on process rather than outcome indicators (see studies number 1, 2 in Table 2). Additionally, in two instances, data validation was performed to address information bias (see studies number 6, 13 in Table 2). Aside from one study in which long-term survival was analysed (see study number 15 in Table 2), the majority reported short-term outcomes.

Table 2

Summary of methodological strengths and weaknesses

#	First author; Year	Length of follow-up time	Performance evaluation strategy	Patient population	Limitations	Control for biases
1	Cronenwett et al. 2007 [33]	3 years	Time trend	Clearly defined sample of patients undergoing vascular surgery	-Risk adjustment was not performed. -Only processes of care were evaluated.	None specified in the article.
2	Campion et al. 2011 [34]	4 years	Performance compared between initial and later participants	Sample of end-of-life cancer patients defined by age, sex and tumor type	-Risk adjustment was not performed. -Only processes of care were evaluated.	None specified in the article.
3	Stern et al. 2011 [35]	5 years	Time trend	Clearly defined sample of cystic fibrosis patients.	Limited number of care centers involved	-The performance of each center was analyzed separately -Analysis was age-adjusted for certain indicators
4	Hermans et al. 2013 [36]	1 year	RCT	Clearly defined sample of diabetic patients	-Short follow-up time. -Highly heterogeneous group of care settings involved	-Use of control group. -Differences between patients as well as care settings were accounted for in the analysis
5	Merle et al. 2009 [37]	6 months	Before/after comparison	Clearly defined sample of patients undergoing surgical care for hip fracture.	-Short follow-up time -Small number of hospitals involved. -No use of control group	Analysis performed for each hospital involved.
6	Hall et al. 2009 [38]	3 years	Time trend	Sample of patients undergoing general and vascular surgery	-Self selection of centers, thus the results may not be representative of the population. -The analysis is based on sampling.	Different modelling approaches were used to control for differences between patients.
7	Tepas III et al. 2014 [39]	15 months	Time trend	Sample of patients undergoing general and vascular surgery.	-Short follow-up period. -Little information on patient population.	Risk-adjustment was performed.
8	Nuti et al. 2016 [40]	5 years	Time trend	General population	-Highly aggregated data analysis (regional level) -Use of composite indicator that is based on 14 indicators.	-Population-based study -Data was standardized for age and sex
9	Govaert et al. 2016 [41]	3 years	Time trend	-Population-based -Clearly defined sample of patients undergoing surgery for colorectal cancer.	-Only short-term survival was considered.	-Population-based study -Risk-adjustment was performed to account for differences between patients. -External data validation performed
10	Piccoliori et al. 2020 [42]	3 years	Before/after comparison	Sample of patients with chronic conditions.	-Small-scale study -Results were not adjusted for differences between care providers or patients -Little information on patient population	-Information bias was diminished by removing prevalences from the analysis.
11	Qvist et al. 2004 [43]	1 year	Time trend	Few information on patients characteristics as the focus of the analysis is on the providers	-Short follow-up time period -No risk adjustment was performed.	None specified in the article.
12	Nuti et al. 2013 [44]	4 years	Time trend	General population	-Highly aggregated data analysis (regional level)	-Population-based study -Data was standardized for the population’s health needs
13	Van Leersum et al. 2013 [45]	2 years	Time trend	-Population-based -Clearly defined sample of patients undergoing surgery for colorectal cancer.	- Short follow-up time period	-Population-based study -The data was adjusted for differences between patients. -External data validation was performed
14	Margeirsdottir et al. 2010 [46]	5 years	Time trend	-Population-based -Clearly defined sample of pediatric patients with diabetes.	-No information on non-participants	-Population-based study -Adjustment for patient age and duration of disease was performed. -All measurements were standardized.
15	Kodeda et al. 2015 [47]	18 years	Time trend	-Population-based - Clearly defined sample of patients with colorectal cancer.	-Lack of external data validation -Absence of control group	-Population-based study -Longer follow-up time period.
16	Pinnarelli et al. 2011 [48]	3 years	Time trend	-Population-based - Clearly defined sample of patients undergoing surgical care for hip fracture.	-A number of confounders including patient co-morbidities could not be controlled for in the analysis.	-Population-based study -Risk-adjustment of performance was performed.
17	Miyata et al. 2012 [49]	4 years	Performance compared between initial and later participants	Clearly defined sample of patients undergoing coronary artery bypass graft (CABG)	-Limited number of participants involved	-Risk-adjustment of performance was performed

Discussion

Summary of main findings

This systematic literature review addresses our research questions by providing evidence concerning a positive association between the use of benchmarking and quality, which is further stimulated when combining benchmarking with specific interventions, such as meetings between participants, quality improvement plans and financial incentives.

The studies we analysed confirm that benchmarking is a useful tool which has yet to be systematically implemented at all levels of the healthcare system [1].

Most of the initiatives were voluntary based and had a bottom-up approach, involving mainly medical associations and academia. More specifically, our findings suggest that benchmarking data was in large part used at the micro level by speciality departments and hospitals, sometimes in the context of small-scale pilot studies that involved a small number of participants [35, 37, 42]. This raises questions regarding the involvement of high-level decision makers when it comes to the use of benchmarking. Importantly, the geographical scope of these studies was limited to Europe and North America.

Research on the practice of benchmarking

Healthcare systems worldwide are increasingly being called on to identify reliable methods for measuring quality of care [51, 52]. This is partly due to the increasing availability of data generated at all levels of the healthcare system. The practice of benchmarking and performance improvement has been considered, especially in Europe, a growing area of research which has received less attention than the identification of performance indicators that reliably benchmark information in different clinical areas [16].

Following the identification of indicators, the questions ensue as to which users they are intended for and the purpose of their use. The information needs of users may differ depending on their decision-making capacity when it comes to taking action based on benchmarking data. As such, the actionability of this type of evidence-based information remains debatable. Furthermore, certain studies [53, 54] have suggested that benchmarking data was generally underused by decision makers within the healthcare system. On the other hand, when healthcare providers do take into account benchmarking data, reluctance may arise when integrating this information into practice for changing behaviour and procedures [55]. The clinician’s subjective perception can also be a factor when deciding on which areas of performance to consider for improvement [37].

Benchmarking and quality improvement

All articles considered in this review reported performance improvement following communication of benchmarking data. One could argue, however, that the sustainability of the reported quality improvement could differ from one study to another depending on the length of follow-up time and monitoring of performance. For instance, in five of the articles, performance was monitored over a relatively short period of time, ranging from 6 months to 2 years [36, 37, 39, 42, 43]. Although these studies validate the use of benchmarking as a tool for quality improvement, researchers have argued that, in this case, performance improvement could be attributed to the experimental conditions under which benchmarking is taking place as well as the newness of the initiative itself, rather than a long-lasting impact of performance measurement [49, 56]. On the other hand, articles reporting a longer follow-up time have also shown sustained performance improvement [33, 38, 40, 41, 46]. Interestingly, only one article focused on the capacity of benchmarking to reduce geographical variation [11].

Furthermore, our results suggest that quality improvement was achieved not only by high performing organisations but also by those whose performance was initially suboptimal [38, 39]. It has long been speculated that the combination of continuous performance measurement with interventions, such as discussions of benchmarking results, was associated with long-lasting quality improvement [43, 46, 56]. The majority of articles from our results reported the implementation of these interventions in addition to benchmarking, ranging from meetings to quality improvement plans and audit & feedback. Meetings between benchmarking participants were the most frequently cited intervention by the articles. Although this type of intervention has a more supporting than active role in terms of quality improvement, interactions between benchmarking participants do facilitate direct exchange of experience and transfer of best practices, thus prompting organisations to further engage in activities adapted to their performance needs. Furthermore, our results showed that meetings were often combined with other interventions, such as quality improvement plans and financial incentives. For instance, Italy’s Tuscany region combines discussions of publicly reported benchmarking data between different stakeholders with pay for performance schemes for local decision-makers and clinicians [40, 44, 48]. Although many have recognized the positive effects that benchmarking and quality improvement activities have, some have argued that the extent of their impact on quality remains unclear, and as such, establishing a causal relationship between benchmarking and quality remains difficult [38, 43, 57].

The relationship between process and outcome indicators

Lastly, several articles included in our review suggest that performance improvement on process indicators is correlated with better outcomes as well, particularly in primary care and certain clinical areas such as diabetes and colorectal cancer [42, 44‐47]. This should come to no surprise as it is widely accepted that processes of care contribute in large part to patient outcomes [58, 59]. However, it has been argued that outcomes are reflective of a wide variety of determinants, some related to healthcare and others not. Furthermore, processes of care that are measurable may represent only a fraction of all the processes that contribute to a particular outcome [60]. However, given the ongoing transformation of performance management systems and the rise of innovative measures, including patient reported data, population based indicators and measures on resilience and sustainability [61], one could expect the relationship between processes and outcomes to change.

Limitations

This literature review included peer-reviewed studies in English, and excluded grey literature as well as foreign language journals. Furthermore, the results show a very limited number of studies on the relationship between benchmarking and quality improvement, despite the growing interest and research on this topic at the international level. Many articles focus on the practical actions to foster benchmarking as a tool to learn from excellence [62], set strategic planning [40, 63], and improve reputation by naming and faming and peer learning [26]. However, these articles provide specific frameworks on the use of benchmarking rather than report results and impacts of its application.

Another limitation relates to the robustness of the methods used as almost all articles are based on observational analysis and are thus susceptible to methodological biases.

Conclusions

The limited number of studies generated by this systematic literature review suggests that the contribution of benchmarking in healthcare needs to be further explored. Our findings also indicate that benchmarking may foster quality improvement, and that complementary interventions, such as meetings and audit & feedback, can also play a role in further reinforcing quality improvement.

As data becomes more widely available, it is becoming increasingly important for healthcare systems to identify reliable performance indicators that are adapted to the needs of different stakeholders, who ultimately, are the end-users of benchmarking information. As such, further research needs to be conducted as to discern the factors, including contextual elements, that could influence the uptake of benchmarking at all levels of the healthcare system. Although this study points towards the positive impact of combining performance measurement with interventions on quality, future research should analyse the individual impact of these interventions, including non traditional ones such as the promotion of good performance practices.

Acknowledgements

The authors want to acknowledge the members of the Healthcare Management Laboratory of the Scuola Superiore Sant’Anna.

Declarations

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Open AccessThis article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1.

Additional file 2.

Additional file 3.

Additional file 4.

Ettorchi - Tardy A, Levif M, Michel P. Benchmarking: A method for continuous quality improvement in health. Healthc Policy. 2012;7(4):101–19.

Camp RC. The search for industry best practices that lead to superior performance; 1989. p. 320.

Liebfried HJ, McNair CJ. In: Sons JW, editor. Benchmarking: a tool for continuous improvement. New York: Wiley; 1992.

Watson GH. Strategic benchmarking: How to rate your comoany’s performance against the world’s best. John Wiley & Sons Incorporated, editor. Somerset: Wiley; 1993.

Bowerman M, Francis G, Ball A, Fry J. The evolution of benchmarking in UK local authorities. Benchmarking An Int J. 2002;9(5):429–49.CrossRef

Doug M, Gift B. Collaborative Benchmarking in Healthcare. J Qual Improv. 1994;20:239–49.

Camp RC, Tweet AG. Benchmarking applied to health care. Jt Comm J Qual Improv. 1994;20(5):229–38.PubMed

Thonon F, Watson J, Saghatchian M. Benchmarking facilities providing care: an international overview of initiatives. SAGE Open Med. 2015;3:205031211560169.CrossRef

Wennberg JE. Understanding geographic variations in health care delivery. N Engl J Med. 1999;340(1):52–3.PubMedCrossRef

10.

Arah OA, Klazinga NS, Delnoij DMJ, Ten Asbroek AHA, Custers T. Conceptual frameworks for health systems performance: a quest for effectiveness, quality, and improvement. Int J Qual Heal Care. 2003;15(5):377–98.CrossRef

11.

Nuti S, Seghieri C. Is variation management included in regional healthcare governance systems? Some proposals from Italy. Health Policy (New York). 2014;114(1):71–8.CrossRef

12.

Batalden PB, Davidoff F. What is “quality improvement” and how can it transform healthcare? Qual Saf Heal Care. 2007;16(1):2–3.CrossRef

13.

Smith P, Mossialos E, Papanicolas I, Leatherman S. Performance measurement and professional improvement. Cambridge: Cambridge University Press; 2009. p. 613–40.

14.

Expert Group on Health Systems Performance Assessment. So what? - Strategies across Europe to assess quality of care: European Union; 2016. p. 92–107. Available from: http://europa.eu

15.

Oliver TR. Population health rankings as policy indicators and performance measures. Prev Chronic Dis. 2010;7(5):A101.

16.

Klazinga N, Fischer C, Ten Asbroek A. Health services research related to performance indicators and benchmarking in Europe. J Heal Serv Res Policy. 2011;16(SUPPL. 2):38–47.CrossRef

17.

Barbazza E, Klazinga NS, Kringos DS. Exploring the actionability of healthcare performance indicators for quality of care: a qualitative analysis of the literature, expert opinion and user experience. BMJ Quality & Safety 2021;30:1010–20.

18.

Nolte E. International benchmarking of healthcare quality: a review of the literature. Rand Heal Q. 2012;1(4):e1000097.

19.

Nuti S, Vainieri M. Strategies and tools to manage variation in regional governance systems. In: Handbook of Health Services Research. Boston: Springer Reference; 2014. p. 23.

20.

Codling S. In: Gower Publishing L, editor. Best practice benchmarking: a management guide. Aldershot: Gower Publishing, Ltd; 1995.

21.

Lober WB, Flowers JL. Consumer reports in health care: do they make a difference? Semin Oncol Nurs. 2011;27(3):169–82. Available from: https://doi.org/10.1016/j.soncn.2011.04.002.CrossRefPubMed

22.

Prang KH, Maritz R, Sabanovic H, Dunt D, Kelaher M. Mechanisms and impact of public reporting on physicians and hospitals’ performance: a systematic review (2000–2020). Plos One. 2021;16(2 February):1–24. Available from: https://doi.org/10.1371/journal.pone.0247297.CrossRef

23.

Hibbard JH, Stockard J, Tusler M. Does publicizing hospital performance stimulate quality improvement efforts? Health Aff. 2003;22(2):84–94.CrossRef

24.

Bevan G, Fasolo B. Models of governance of public services: Empirical and behavioural analysis of ‘econs’ and ‘humans’. In: Behavioural Public Policy. Cambridge: Cambridge University Press; 2013. p. 38–62.CrossRef

25.

Bevan G, Wilson D. Does “naming and shaming” work for schools and hospitals? Lessons from natural experiments following devolution in England and Wales. Public Money Manag. 2013;33(4):245–52.CrossRef

26.

Bevan G, Evans A, Nuti S. Reputations count: why benchmarking performance is improving health care across the world. Heal Econ Policy Law. 2019;14(2):141–61.CrossRef

27.

World Health Organization. Health systems : improving performance: World Health Organization; 2000. p. 215.

28.

WHO, OECD. DAC guidelines and reference series: poverty and health; 2003. p. 55. Available from: http://www.oecd-ilibrary.org/development/poverty-and-health_9789264100206-en

29.

Arah OA, Westert GP, Hurst J, Klazinga NS. A conceptual framework for the OECD health care quality indicators project. Int J Qual Heal Care. 2006;18(SUPPL. 1):5–13.CrossRef

30.

Moher D, Liberati A, Tetzlaff J, Altman DG, Altman D, Antes G, et al. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Plos Med. 2009;6(7).

31.

Sirriyeh R, Lawton R, Gardner P, Armitage G. Reviewing studies with diverse designs: the development and evaluation of a new tool. J Eval Clin Pract. 2012;18(4):746–52.PubMedCrossRef

32.

Donabedian A. The Quality of Care: How Can It Be Assessed? JAMA. 1988;260(12):1743–8.PubMedCrossRef

33.

Cronenwett JL, Likosky DS, Russell MT, Eldrup-Jorgensen J, Stanley AC, Nolan BW. A regional registry for quality assurance and improvement: the vascular study Group of Northern new England (VSGNNE). J Vasc Surg. 2007;46(6):1093–103.PubMedCrossRef

34.

Campion FX, Larson LR, Kadlubek PJ, Earle CC, Neuss MN. Advancing performance measurement in oncology. Am J Manag Care. 2011;17(Suppl 5):31–5.

35.

Stern M, Niemann N, Wiedemann B, Wenzlaff P. Benchmarking improves quality in cystic fibrosis care: a pilot project involving 12 centres. Int J Qual Heal Care. 2011;23(3):349–56.CrossRef

36.

Hermans MP, Elisaf M, Michel G, Muls E, Nobels F, Vandenberghe H, et al. Benchmarking is associated with improvedquality of care in type 2 diabetes: the OPTIMISE randomized, controlled trial. Diabetes Care. 2013;36(11):3388–95.PubMedPubMedCentralCrossRef

37.

Merle V, Moret L, Pidhorz L, Dujardin F, Gouin F, Josset V, et al. Does comparison of performance lead to better care? A pilot observational study in patients admitted for hip fracture in three French public hospitals. Int J Qual Heal Care. 2009;21(5):321–9.CrossRef

38.

Hall BL, Hamilton BH, Richards K, Bilimoria KY, Cohen ME, Ko CY. Does surgical quality improve in the american college of surgeons national surgical quality improvement program: an evaluation of all participating hospitals. Ann Surg. 2009;250(3):363–74.PubMedCrossRef

39.

Tepas JJ, Kerwin AJ, Devilla J, Nussbaum MS. Macro vs micro level surgical quality improvement: a regional collaborative demonstrates the case for a national NSQIP initiative. J Am Coll Surg. 2014;218(4):599–604. Available from: https://doi.org/10.1016/j.jamcollsurg.2013.12.017.CrossRefPubMed

40.

Nuti S, Vola F, Bonini A, Vainieri M. Making governance work in the health care sector: evidence from a “natural experiment” in Italy. Heal Econ Policy Law. 2016;11(1):17–38.CrossRef

41.

Govaert JA, Van Dijk WA, Fiocco M, Scheffer AC, Gietelink L, Wouters MWJM, et al. Nationwide Outcomes Measurement in Colorectal Cancer Surgery: Improving Quality and Reducing Costs Presented at the European Society of Surgical Oncology 34th Congress, Liverpool, United Kingdom, October 2014. J Am Coll Surg. 2016;222(1):19–29.e2. Available from: https://doi.org/10.1016/j.jamcollsurg.2015.09.020.CrossRefPubMed

42.

Piccoliori G, Mahlknecht A, Abuzahra ME, Engl A, Breitenberger V, Vögele A, et al. Quality improvement in chronic care by self-audit, benchmarking and networking in general practices in South Tyrol, Italy: results from an interventional study. Fam Pract. 2021;38(3):253–8.PubMedCrossRef

43.

Qvist P, Rasmussen L, Bonnevie B, Gjørup T. Repeated measurements of generic indicators: a Danish national program to benchmark and improve quality of care. Int J Qual Heal Care. 2004;16(2):141–8.CrossRef

44.

Nuti S, Seghieri C, Vainieri M. Assessing the effectiveness of a performance evaluation system in the public health care sector: some novel evidence from the Tuscany region experience. J Manag Gov. 2013;17(1):59–69.CrossRef

45.

Van Leersum NJ, Snijders HS, Henneman D, Kolfschoten NE, Gooiker GA, Ten Berge MG, et al. The dutch surgical colorectal audit. Eur J Surg Oncol. 2013;39(10):1063–70. Available from: https://doi.org/10.1016/j.ejso.2013.05.008.CrossRefPubMed

46.

Margeirsdottir HD, Larsen JR, Kummernes SJ, Brunborg C, Dahl-Jørgensen K. The establishment of a new national network leads to quality improvement in childhood diabetes: implementation of the ISPAD guidelines. Pediatr Diabetes. 2010;11(2):88–95.PubMedCrossRef

47.

Kodeda K, Johansson R, Zar N, Birgisson H, Dahlberg M, Skullman S, et al. Time trends, improvements and national auditing of rectal cancer management over an 18-year period. Color Dis. 2015;17(9):O168–79.CrossRef

48.

Pinnarelli L, Nuti S, Sorge C, Davoli M, Fusco D, Agabiti N, et al. What drives hospital performance? The impact of comparative outcome evaluation of patients admitted for hip fracture in two Italian regions. BMJ Qual Saf. 2012;21(2):127–34.PubMedCrossRef

49.

Miyata H, Motomura N, Murakami A, Takamoto S. Effect of benchmarking projects on outcomes of coronary artery bypass graft surgery: challenges and prospects regarding the quality improvement initiative. J Thorac Cardiovasc Surg. 2012;143(6):1364–9. Available from: https://doi.org/10.1016/j.jtcvs.2011.07.010.CrossRefPubMed

50.

World Bank. World Bank country and lending groups [internet]. 2021. Available from: https://datahelpdesk.worldbank.org/knowledgebase/articles/906519-world-bank-country-and-lending-groups

51.

World Health Organization, World Bank Group O. Delivering quality health services: World Health Organization, World Bank Group, OECD; 2018. p. 1–100. Available from: http://apps.who.int/bookorders

52.

Kruk ME, Gage AD, Arsenault C, Jordan K, Leslie HH, Roder-DeWan S, et al. High-quality health systems in the sustainable development goals era: time for a revolution. Lancet Glob Heal. 2018;6(11):e1196–252.CrossRef

53.

Clarke A, Taylor-Phillips S, Swan J, Gkeredakis E, Mills P, Powell J, et al. Evidence-based commissioning in the English NHS: who uses which sources of evidence? A survey 2010/2011. BMJ Open. 2013;3(5):1–6.CrossRef

54.

Ivankovic D, Poldrugovac M, Garel P, Klazinga NS, Kringos DS. Why, what and how do European healthcare managers use performance data? Results of a survey and workshop among members of the European hospital and healthcare federation. Plos One. 2020;15(4):1–19. Available from: https://doi.org/10.1371/journal.pone.0231345.CrossRef

55.

De Lange DW, Dongelmans DA, De Keizer NF. Small steps beyond benchmarking. Rev Bras Ter Intensiva. 2017;29(2):128–30.PubMedPubMedCentralCrossRef

56.

Lied TR, Kazandjian VA. A Hawthorne strategy: implications for performance measurement and improvement. Clin Perform Qual Health Care. 1998;6(4):201–4.PubMed

57.

Braithwaite J, Yukihiro M, Johnson J. Healthcare reform, quality and safety: perspectives, participants, partnerships and prospects in 30 countries. Boca Raton: CRC Press; 2017.CrossRef

58.

Obit LJ. The measurement of health service outcomes. In: Oxford Textbook of Health Care; 1993.

59.

Goldstein, H., Spiegelhalter DJ. League Tables and Their Limitations : Statistical Issues in Comparisons of Institutional Performance Author ( s ): Harvey Goldstein and David J . Spiegelhalter Source : Journal of the Royal Statistical Society . Series A ( Statistics in Society ), Vol . 1. Society 2008;159(3):385–443.

60.

Lovaglio PG. Benchmarking strategies for measuring the quality of healthcare: Problems and prospects. Sci World J. 2012;2012(iii):606154.

61.

Vainieri M, Noto G, Ferre F, Rosella LC. A performance management system in healthcare for all seasons? Int J Environ Res Public Health. 2020;17(15):1–10.CrossRef

62.

Borghini A, Corazza I, Nuti S. Learning from excellence to improve healthcare services: the experience of the maternal and child care pathway. Int J Environ Res Public Health. 2021;18(4):1–10.CrossRef

63.

Vainieri M, Lungu DA, Nuti S. Insights on the effectiveness of reward schemes from 10-year longitudinal case studies in 2 Italian regions. Int J Health Plann Manag. 2018;33(2):e474–84.CrossRef

Titel: The contribution of benchmarking to quality improvement in healthcare. A systematic literature review
verfasst von: Claire Willmington
Paolo Belardi
Anna Maria Murante
Milena Vainieri
Publikationsdatum: 01.12.2022
Verlag: BioMed Central
Erschienen in: BMC Health Services Research / Ausgabe 1/2022
Elektronische ISSN: 1472-6963
DOI: https://doi.org/10.1186/s12913-022-07467-8

Springer Medizin

Abstract

Background

Methods

Results

Conclusions

Supplementary Information

Publisher’s Note

Background

Methods

Search strategy

Study selection

Data extraction and analysis

Results

Literature search

Study characteristics and benchmarking approaches

Quality improvement in terms of processes and outcomes

Quality improvement related actions

Methodological approaches for quality improvement measurement

Discussion

Summary of main findings

Research on the practice of benchmarking

Benchmarking and quality improvement

The relationship between process and outcome indicators

Limitations

Conclusions

Acknowledgements

Declarations

Ethics approval and consent to participate

Consent for publication

Competing interests

Publisher’s Note

Supplementary Information

Weitere Artikel der Ausgabe 1/2022

The effect of integrated health care in patients with hypertension and diabetes: a systematic review and meta-analysis

Willingness to pay for HPV vaccine among female health care workers in a Chinese nationwide survey

Social determinants of health and the prediction of missed breast imaging appointments

Understanding post-conflict mental health needs and co-producing a community-based mental health intervention for older adults in Colombia: a research protocol

Influence of the implementation of strengths-based nursing and healthcare on early childhood nurses’ competencies: a mixed-method study

Protecting, managing and bending boundaries: a biomedicalization perspective on Swedish youth clinics’ responses to mental (ill) health