Definition of a high-volume hospital
In this study, we showed that being treated in a higher volume hospital increased the PFS of patients, compared with being treated in a lower volume hospital. More specifically, the probability of relapse (including death) was about twice as high for patients treated in lower volume hospitals (i.e. 1.94 times higher, p < 0.001) as for patients treated in higher volume hospitals. Indeed, the median PFS in high-volume hospitals was 20 months, versus only 14.2 months in low-volume hospitals. Moreover, the higher proportion of complete tumor resections and the lower proportion of reoperations (Table 1) support the notion that the quality of the first-line surgery appears to be better in high-volume hospitals, as reported previously by Ioka et al. [9] and Vernooij et al. [13].
To define a high-volume hospital, different countries have employed different thresholds that are based on the prevalence of the disease [4-14]. For example, the mean volume of activity of high-volume hospitals in the study by Ioka et al. on a Japanese dataset was 8.8 patients, which may be considered low compared with what has been seen in studies in the USA [9]. Yet it appears that in 2012, 93% of the hospitals in the Rhone-Alpes region of France had treated fewer than 12 patients per year in first-line treatment for EOC, 82% had treated fewer than 8, and 60% had treated fewer than 5. We chose the upper quartile (12 patients) as the threshold in the main analysis, in order to obtain a share of 25% of patients treated in an HVH. This share is more in line with the one obtained with the threshold of 20 cases that is widely used in the USA, which yielded a distribution of 17.9% of patients treated in HVH in the study by Bristow et al. [6]. We also considered two other thresholds, namely 5 and 8 patients per year, as a sensitivity analysis in order to cover all of the quartiles of the patient distribution. The sensitivity analysis showed that the results were mixed with a threshold of 8 cases/year, and that there was no longer a volume-outcome effect with a threshold of 5 cases/year. Indeed, with a threshold of 8 cases/year, the multivariate analysis revealed a positive impact of hospital volume on outcomes, whereas the propensity score analysis revealed no association at the 5% level of significance. Thus, the sensitivity analysis showed that the cut-off has to be restrictive enough to identify a volume-outcome relationship for EOC.
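The effect of the cut-off on the share of patients classified as treated in an HVH can be illustrated with a minimal sketch. The hospital case counts below are hypothetical and are not the study data; the function name and the convention that a hospital counts as high-volume when it treats at least the threshold number of cases per year are assumptions made for illustration.

```python
def share_of_patients_in_hvh(annual_cases, threshold):
    """Share of patients treated in hospitals with at least `threshold` cases/year.

    Assumption: a hospital is 'high-volume' when its annual case count is
    greater than or equal to the threshold.
    """
    total = sum(annual_cases)
    in_hvh = sum(n for n in annual_cases if n >= threshold)
    return in_hvh / total


# Hypothetical annual first-line EOC case counts, one entry per hospital.
hypothetical_cases = [2, 3, 4, 6, 7, 9, 14, 20]

# The three thresholds considered in the main and sensitivity analyses.
shares = {t: share_of_patients_in_hvh(hypothetical_cases, t) for t in (5, 8, 12)}
for t, s in shares.items():
    print(f"threshold {t:>2} cases/year: {s:.1%} of patients in an HVH")
```

A lower threshold places more patients in the high-volume group, which makes the two groups less distinct in terms of hospital activity.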
Many countries already require a minimum level of activity for a hospital to be authorized to provide cancer treatments. In France, the minimum cut-off to receive authorization to treat gynecological cancers was defined by the French ministerial order of 27 March 2007 as 20 surgeries per year. Below this volume of activity, a hospital is no longer authorized to treat patients with gynecological cancers. This threshold, however, takes into account all of the various types of gynecologic cancers, such as cervical, ovarian, vaginal, uterine, and vulvar cancers. Our findings indicate that there is a need for a specific minimum activity cut-off for ovarian cancer only. Indeed, the overall threshold of 20 cases per year does not distinguish ovarian cancer from the other gynecological cancers. Out of all of the patients in first-line treatment for EOC in the Rhone-Alpes region of France in 2012, 71% were treated in hospitals with fewer than 12 cases per year, 50% in hospitals with fewer than 8 cases per year, and 24% in hospitals with fewer than 5 cases per year. This distribution of hospital volume activities is not specific to the Rhone-Alpes region. Indeed, the public website (footnote 2) maintained by the French National Authority for Health (HAS) recorded that in the most populous region of France (i.e. Ile-de-France), 118 hospitals had authorization to treat gynecologic cancers in 2017, compared with 71 for the Rhone-Alpes region. With a population of 6,574,708 for the Rhone-Alpes region and of 12,142,802 for Ile-de-France in 2016 (source: National Institute of Statistics and Economic Studies), there was one hospital treating gynecologic cancers for every 92,601 residents in the Rhone-Alpes region and one for every 102,905 residents in Ile-de-France. As the number of hospitals per resident is similar between the two regions, the distribution of hospital volume activities is also likely to be similar.
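The residents-per-hospital figures quoted above follow directly from the population and hospital counts given in the text. The short sketch below reproduces the arithmetic; the variable and function names are ours, and the result is truncated to a whole number of residents.

```python
# Figures quoted in the text (2016 population, hospitals authorized in 2017).
regions = {
    "Rhone-Alpes": {"population": 6_574_708, "hospitals": 71},
    "Ile-de-France": {"population": 12_142_802, "hospitals": 118},
}


def residents_per_hospital(population, hospitals):
    """Residents per authorized hospital, truncated to a whole number."""
    return population // hospitals


ratios = {name: residents_per_hospital(**figures) for name, figures in regions.items()}
for name, value in ratios.items():
    print(f"{name}: one hospital per {value:,} residents")
```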
Our findings appear to support the use of a specific cut-off for ovarian cancer, and more research needs to be done for other rare cancers in order to verify whether a specific minimum activity cut-off is similarly required. Nevertheless, a threshold at the hospital level does not take into account the heterogeneity among the practitioners at any given hospital. A recent study has shown that the physician’s volume of activity also positively correlates with survival, and that the combination of being treated in a high-volume hospital by a high-volume physician appears to be superior in terms of survival compared with other combinations of hospital and physician volumes of activity [6]. Further work is needed to develop a management program that takes into account the volume of activity at both the hospital and the physician level. Hospital participation in clinical trials has also been shown to improve EOC patient outcomes [24]. More research needs to be done to properly understand what underlies the volume-outcome relationship.
Why should we use a counterfactual approach?
We used observational data, which allowed for a better external validity than randomized controlled trials (RCTs) [17]. However, with observational data, which are often used in retrospective studies analyzing the care pathway, the selection bias due to the sample heterogeneity must be taken into account [17]. Indeed, a selection bias, or recruitment bias, could appear since participation in the treatment was not random: some types of patients had a higher probability of being treated than others. Several well-known methods can be used to correct for this issue, such as stratification or multivariate analysis, and more sophisticated methods are increasingly being used, such as matching using the propensity score or instrumental variables [17].
In our case, patients treated in high- versus low-volume hospitals were not similar (Table 1). Thus, we expected selection bias to occur, which means that some types of patients were more likely to be treated in a high-volume hospital than others.
The propensity score approach is based on less constrained assumptions than multivariate analysis [25, 26]. Indeed, both propensity scores and multivariate analysis are based on the conditional independence assumption (CIA), which specifies that, conditional on observed covariates, patients were randomly treated in a high- or low-volume hospital. Based on the covariates recorded in our database, the CIA hypothesis assumes that two patients with the same age, cancer history, presence or absence of ascites, histology, FIGO stage, neoadjuvant chemotherapy, and tumor grade will have similar outcomes (i.e. survival). However, multivariate analysis requires a stronger assumption about the distribution of the covariates and their relationship with relapse-free survival. In our case, we also had to choose a distribution of the hazard in order to fit a parametric AFT model of the relapse-free survival on a variable denoting treatment and on a set of covariates, because the proportional hazards assumption was violated.
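For readers who prefer a formal statement, the assumptions discussed above can be written as follows. This is a sketch in our own notation, which is not taken from the study: D denotes treatment in a high-volume hospital, X the observed covariates, and T(d) the potential relapse-free survival time under each type of hospital. The error distribution of the AFT model is left unspecified here.

```latex
% Notation (assumed for illustration):
%   D_i = 1 if patient i is treated in a high-volume hospital, 0 otherwise
%   X_i = observed covariates (age, cancer history, ascites, histology,
%         FIGO stage, neoadjuvant chemotherapy, tumor grade)
%   T_i(d) = potential relapse-free survival time of patient i under D_i = d
\begin{align}
  \bigl(T_i(0),\, T_i(1)\bigr) &\perp\!\!\!\perp D_i \mid X_i
    && \text{(conditional independence)} \\
  0 < \Pr(D_i = 1 \mid X_i) &< 1
    && \text{(overlap)} \\
  \log T_i &= \beta_0 + \delta D_i + X_i^{\top}\beta + \sigma\varepsilon_i
    && \text{(parametric AFT model)}
\end{align}
```

The first two conditions are shared by both approaches. The third is specific to the multivariate analysis: it imposes a functional form and a distribution for the error term, which the propensity score approach does not require.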
Therefore, the combination of a multivariate analysis and a matching method allowed us to determine both the conditional and the marginal effects of being treated in a high-volume hospital, and to check the robustness of our findings. The conditional effect indicates that if a patient treated in a lower volume hospital had instead been treated in a higher volume hospital, this would, on average, have improved her progression-free survival (p < 0.001). Furthermore, the marginal treatment effect indicates that patients treated in higher volume hospitals had a probability of relapse (including death) that was nearly half that of patients treated in lower volume hospitals (1.94-fold difference, p < 0.001), and that the absolute difference in survival was significant (p < 0.001) (see Fig. 2). We have reason to be confident in the robustness of our result since both the parametric (AFT model) and the semi-parametric (propensity score) approaches yielded similar results.
With both methods, the type of chemotherapy was included as an indicator equal to one if the patient received neoadjuvant chemotherapy. We did not differentiate between neoadjuvant chemotherapy alone, neoadjuvant chemotherapy combined with adjuvant chemotherapy, adjuvant chemotherapy alone, or no chemotherapy at all, because this study sought to measure the impact of being treated in an HVH in first-line treatment. Adjuvant chemotherapy is not a first-line treatment, however, and could hence not be included as a prognostic factor. Neoadjuvant chemotherapy has been shown to be associated with lower overall survival (OS), meaning that it is linked to observed and unobserved patient characteristics that worsen outcomes [27]. Thus, by controlling for it as a prognostic factor, we indirectly controlled for these observed and unobserved characteristics.
In the multivariate analysis, we used an AFT model instead of a semi-parametric Cox regression due to the non-proportionality of the hazard. We used IPW matching as it was the method that best fit our data. Indeed, the IPW was the method with the lowest mean and median for the standardized difference of the mean, which indicates that this was the matching method that best balanced the covariates between high- and low-volume hospitals. Moreover, two simulation studies have shown that the IPW appears to perform better in determining the marginal hazard ratio of the treatment effect, compared with other matching methods [25, 28]. It should be noted that the common support of the distribution of the propensity score is sufficient [see Additional file 2] to validate the overlap assumption. The mean standardized difference in the mean was 20.4 before matching versus 7.3 after matching using the IPW, which reveals a high quality of adjustment for the IPW matching. To our knowledge, this is the first study to use a propensity score approach in regard to the question of the concentration of care in ovarian cancer, while these methods have been widely used with other diseases [29, 30].
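The balance diagnostic described above can be sketched in a few lines. This is not the code used in the study. It assumes that IPW stands for inverse probability weighting with weights 1/e for patients treated in an HVH and 1/(1 - e) for the others, where e is the propensity score, and that the standardized difference is expressed in percent using the average of the two group variances. The toy data, the variable names, and the propensity values are hypothetical; in practice the propensity score would be estimated from the covariates, for example with a logistic regression.

```python
from math import sqrt


def ipw_weights(treated, propensity):
    """Inverse probability weights: 1/e if treated, 1/(1 - e) otherwise."""
    return [1.0 / e if d == 1 else 1.0 / (1.0 - e)
            for d, e in zip(treated, propensity)]


def weighted_mean(values, weights):
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def weighted_var(values, weights):
    m = weighted_mean(values, weights)
    return sum(w * (v - m) ** 2 for v, w in zip(values, weights)) / sum(weights)


def standardized_difference(x, treated, weights=None):
    """Absolute standardized difference of the mean between groups, in percent."""
    if weights is None:
        weights = [1.0] * len(x)  # unweighted, i.e. before adjustment
    stats = {}
    for group in (0, 1):
        xs = [v for v, d in zip(x, treated) if d == group]
        ws = [w for w, d in zip(weights, treated) if d == group]
        stats[group] = (weighted_mean(xs, ws), weighted_var(xs, ws))
    (m0, v0), (m1, v1) = stats[0], stats[1]
    return 100.0 * abs(m1 - m0) / sqrt((v1 + v0) / 2.0)


# Hypothetical toy data: 8 patients, one binary covariate (ascites).
in_hvh = [1, 1, 1, 1, 0, 0, 0, 0]
ascites = [1, 1, 1, 0, 1, 0, 0, 0]
# Hypothetical propensity scores that depend only on the covariate.
propensity = [0.75 if a == 1 else 0.25 for a in ascites]

weights = ipw_weights(in_hvh, propensity)
before = standardized_difference(ascites, in_hvh)
after = standardized_difference(ascites, in_hvh, weights)
print(f"standardized difference before: {before:.1f}, after IPW: {after:.1f}")
```

In this toy example the weights remove the imbalance almost entirely because the propensity score is known exactly. With an estimated score some residual imbalance normally remains, as reflected by the value of 7.3 reported above.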