Introduction
Cost-effectiveness analysis supports resource allocation decision-making by comparing the differences in costs and effects of alternative treatment regimens [1–4]. When such analyses are conducted alongside randomized controlled trials (RCTs), the cost-effectiveness of the evaluated treatments is generally expressed in terms of population averages. This provides insight into which of the available treatments performs best for the patient population considered. However, when these patients are characterized by a heterogeneous clinical condition, and their risk profiles are determined by factors such as demographic variations, biometric variations, and co-morbidities, there may be considerable variation in response. In fact, there may be a substantial likelihood that subpopulations exist whose response to one or the other treatment is obscured by the population average [5–8]. Such differences among patients may also lead to systematic variation in resource use and costs, which could be another reason why one or the other treatment performs better in specific subpopulations [2, 6]. Acknowledging patient heterogeneity in health economic evaluation therefore has considerable potential to make resource allocation decision-making more efficient [5, 8–10].
A recently conducted systematic review [5] identified baseline risk, treatment effect, health state utility, and resource utilization as the four input parameters of a health economic evaluation that may be prone to patient heterogeneity. However, as the cost-effectiveness of one treatment compared to another is ultimately determined by the net effect on all these parameters, it is essential that the impact of patient heterogeneity on each of these parameters is considered conjointly rather than in isolation, especially when the purpose is to identify more efficient reimbursement policies. For health economic evaluations conducted alongside an RCT, this can be achieved by conducting such analyses directly in terms of net monetary benefit (NMB) [11–13].
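For reference, the NMB framework referred to above is conventionally written as follows. The symbols are introduced here for illustration and are not taken from the text: \lambda denotes the willingness-to-pay threshold per unit of effect, E_i and C_i the effect and cost observed for patient i, and the bars denote arm means for the new treatment (1) and its comparator (0).

```latex
\begin{align}
\mathrm{NMB}_i &= \lambda E_i - C_i, \\
\Delta\mathrm{NMB} &= \overline{\mathrm{NMB}}_1 - \overline{\mathrm{NMB}}_0
  = \lambda\,(\bar{E}_1 - \bar{E}_0) - (\bar{C}_1 - \bar{C}_0).
\end{align}
```

Under this convention, the new treatment is considered cost-effective at threshold \lambda when \Delta\mathrm{NMB} > 0, which is why the net effect of heterogeneity on costs and effects can be studied through a single outcome.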
Hoch et al. [14] have previously proposed assessing the impact that different sources of patient heterogeneity may have on a treatment’s NMB by means of regression analysis. For example, suppose that one wants to explore whether the cost-effectiveness of a new treatment compared to the current standard treatment is affected by the age of the patient. Using regression analysis, this can be achieved by fitting a regression model with NMB as the dependent variable and the treatment indicator, age, and the interaction between age and the treatment indicator as the independent variables. A low p value for the regression coefficient corresponding to the interaction term then provides evidence that the new treatment’s relative cost-effectiveness varies with age.
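The regression described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation used by Hoch et al.: the data are synthetic, the function name `fit_nmb_interaction` and all numeric values are hypothetical, and the p value uses a normal approximation rather than the exact t distribution.

```python
import math
import numpy as np

def fit_nmb_interaction(nmb, treat, age):
    """Ordinary least squares fit of NMB ~ 1 + treat + age + treat*age.

    Returns the coefficients, their standard errors and t statistics,
    ordered as (intercept, treat, age, treat*age).
    """
    nmb = np.asarray(nmb, dtype=float)
    treat = np.asarray(treat, dtype=float)
    age = np.asarray(age, dtype=float)
    X = np.column_stack([np.ones_like(nmb), treat, age, treat * age])
    beta, _, _, _ = np.linalg.lstsq(X, nmb, rcond=None)
    resid = nmb - X @ beta
    dof = X.shape[0] - X.shape[1]          # residual degrees of freedom
    sigma2 = resid @ resid / dof           # residual variance estimate
    cov = sigma2 * np.linalg.inv(X.T @ X)  # covariance of the coefficients
    se = np.sqrt(np.diag(cov))
    return beta, se, beta / se

# Hypothetical synthetic trial: the benefit of treatment shrinks with age
# (true interaction coefficient of -150 per year of age).
rng = np.random.default_rng(0)
n = 400
treat = rng.integers(0, 2, size=n)
age = rng.uniform(50, 90, size=n)
noise = rng.normal(0, 2000, size=n)
nmb = 20000 + 12000 * treat - 100 * age - 150 * treat * age + noise

beta, se, t_stat = fit_nmb_interaction(nmb, treat, age)
# Two-sided p value for the interaction term, normal approximation.
p_interaction = math.erfc(abs(t_stat[3]) / math.sqrt(2))
print(f"interaction: {beta[3]:.1f} (SE {se[3]:.1f}), p ~ {p_interaction:.3g}")
```

The coefficient on `treat * age` estimates how the difference in NMB between the two arms changes per additional year of age, which is the quantity the interaction test targets.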
While the use of multivariable regression models may provide insight into which sources of patient heterogeneity potentially have an impact on the relative cost-effectiveness of the evaluated treatments, the statistical power to detect such interaction effects is usually low. Moreover, whether such models can actually verify relevant heterogeneity depends strongly on whether the assumed multiplicative structure of the interaction fits reality. As a result, relevant interactions may be missed, or interaction terms that are detected as significant may be over-interpreted. An alternative approach for studying treatment–covariate interaction that makes no assumptions about the nature of the relationship between the outcome and the covariate in each treatment group is the Subpopulation Treatment Effect Pattern Plot (STEPP) methodology [15–17]. STEPP is based on a graphical exploration of the fluctuation in treatment effect across different but overlapping subpopulations defined with respect to increasing levels of the covariate of interest. Although using STEPP to explore how the difference in NMB between two treatments varies as a function of one or more sources of patient heterogeneity could potentially be very useful in identifying more efficient reimbursement policies, to the best of our knowledge, it has not yet been considered. Using the difference in NMB as the measure of treatment benefit and an individualized predicted risk obtained from an RCT as the covariate of interest, the objective of this paper was to illustrate how the STEPP methodology can be used to derive risk-stratified treatment allocation strategies that maximize cost-effectiveness. The approach is illustrated with a case study in heart failure (HF) disease management.
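The idea of overlapping subpopulations can be sketched in a few lines. This is a simplified sliding-window illustration, not the published STEPP procedure (which also provides confidence bands and an interaction test). The function name `stepp_delta_nmb`, the parameters `window` and `step`, and the synthetic data are all assumptions made for this example.

```python
import numpy as np

def stepp_delta_nmb(covariate, treat, nmb, window, step):
    """Difference in mean NMB (treat == 1 minus treat == 0) per subpopulation.

    Patients are sorted by the covariate. Each subpopulation holds `window`
    consecutive patients and starts `step` patients after the previous one,
    so consecutive subpopulations overlap whenever step < window.
    Returns an array with one row per subpopulation:
    (median covariate value, difference in mean NMB).
    """
    covariate = np.asarray(covariate, dtype=float)
    treat = np.asarray(treat)
    nmb = np.asarray(nmb, dtype=float)
    order = np.argsort(covariate, kind="stable")
    cov_s, treat_s, nmb_s = covariate[order], treat[order], nmb[order]
    rows = []
    for start in range(0, len(cov_s) - window + 1, step):
        sl = slice(start, start + window)
        t, y = treat_s[sl], nmb_s[sl]
        if not (t == 1).any() or not (t == 0).any():
            continue  # a subpopulation needs patients from both arms
        rows.append((np.median(cov_s[sl]), y[t == 1].mean() - y[t == 0].mean()))
    return np.array(rows)

# Hypothetical synthetic data: the treatment benefit falls with predicted
# risk and crosses zero at a risk of 0.16.
rng = np.random.default_rng(1)
n = 600
risk = rng.uniform(0.0, 0.6, size=n)
treat = rng.integers(0, 2, size=n)
nmb = 10000 + treat * (4000 - 25000 * risk) + rng.normal(0, 1000, size=n)

curve = stepp_delta_nmb(risk, treat, nmb, window=150, step=30)
for median_risk, delta in curve:
    print(f"median risk {median_risk:.3f}: delta NMB {delta:9.1f}")
```

Plotting the second column of `curve` against the first gives the kind of pattern plot described above; the covariate value at which the curve crosses zero is a natural candidate for a stratification cutoff.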
Discussion
STEPP is a relatively new approach for graphically exploring treatment–covariate interaction that has so far seen limited application in the clinical field [23–26]. Using STEPP, we found that the difference in NMB between intensive support and basic support varied greatly across different but overlapping subpopulations defined with respect to increasing levels of predicted 18-month mortality risk. The difference in NMB between care-as-usual and basic support, in contrast, showed no clear pattern of treatment–covariate interaction. By subsequently selecting the 18-month mortality risk at which the difference in NMB between intensive support and basic support changed sign as the cutoff to stratify patients into two risk categories, we found that, compared to applying basic support to all patients, a stratified approach based on offering intensive support to low-risk patients and basic support to intermediate- to high-risk patients would result in an average gain in NMB of €1312 (95% CI €390–€2346).
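One way in which the average gain of such a stratified policy, and a confidence interval around it, could be estimated from two-arm trial data is sketched below. This is an assumption-laden illustration and not necessarily the estimator used in this study. The gain is taken as the proportion of low-risk patients multiplied by the difference in mean NMB within the low-risk stratum, with a nonparametric percentile bootstrap for the interval. The function names and the synthetic data are hypothetical; only the cutoff of 0.16 is taken from the paper.

```python
import numpy as np

def stratified_gain(risk, treat, nmb, cutoff):
    """Per-patient NMB gain of 'intensive if risk < cutoff, else basic'
    relative to 'basic for everyone' (treat == 1 is the intensive arm)."""
    low = risk < cutoff
    intensive = nmb[low & (treat == 1)]
    basic = nmb[low & (treat == 0)]
    if intensive.size == 0 or basic.size == 0:
        return np.nan  # stratum lacks one arm in this sample
    # Only low-risk patients switch treatment, so the gain is their
    # difference in mean NMB weighted by their share of the population.
    return low.mean() * (intensive.mean() - basic.mean())

def bootstrap_ci(risk, treat, nmb, cutoff, n_boot=500, alpha=0.05, seed=0):
    """Percentile bootstrap interval, resampling patients with replacement."""
    rng = np.random.default_rng(seed)
    n = len(risk)
    stats = np.empty(n_boot)
    for b in range(n_boot):
        idx = rng.integers(0, n, size=n)
        stats[b] = stratified_gain(risk[idx], treat[idx], nmb[idx], cutoff)
    lower, upper = np.nanpercentile(stats, [100 * alpha / 2, 100 * (1 - alpha / 2)])
    return lower, upper

# Hypothetical synthetic data in which the benefit changes sign at risk 0.16.
rng = np.random.default_rng(2)
n = 600
risk = rng.uniform(0.0, 0.6, size=n)
treat = rng.integers(0, 2, size=n)
nmb = 10000 + treat * (4000 - 25000 * risk) + rng.normal(0, 1000, size=n)

gain = stratified_gain(risk, treat, nmb, cutoff=0.16)
ci_low, ci_high = bootstrap_ci(risk, treat, nmb, cutoff=0.16)
print(f"gain {gain:.0f} (95% CI {ci_low:.0f} to {ci_high:.0f})")
```

Because the cutoff is fixed before resampling, this interval reflects sampling variation in the gain but not the uncertainty in choosing the cutoff itself.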
Our finding that more intensive multidisciplinary disease management is not beneficial in intermediate- to high-risk patients may seem counterintuitive to some readers, but is consistent with the study conducted by Pulignano et al. [27], who concluded that “most eligible patients for a hospital-based DMP may be those at intermediate risk who are not too sick and not too healthy”. Our STEPP for care-as-usual against basic support suggests that this also holds for the moderate form of disease management that was provided in the COACH study. However, our other STEPP indicates that low-risk patients may still benefit from a form of disease management that is more intensive than basic support. Although there is also evidence to suggest that intensive, post-discharge disease management is unnecessary in low-risk patients [28–30], our latter finding is consistent with several previously conducted subgroup analyses. Hebert et al. [31] found that when comparing severe (NYHA class III and IV) and less severe (NYHA class I and II) patients, nurse-led disease management was more likely to be cost-effective in the less severe patients. Similarly, Miller et al. [32], who conducted a model-based evaluation to investigate the lifetime cost-effectiveness of telephonic support for systolic HF patients, obtained a slightly less favorable cost-effectiveness ratio for this intervention after NYHA class I patients were eliminated from their study population. Finally, Goehler et al. [33] found that the median lifetime incremental cost-effectiveness ratio increased by €15,900/quality-adjusted life year (QALY) for male patients and €600/QALY for female patients when the average age of the cohort passing through their model was increased from 55 to 75 years. When combining our results with the findings presented in these previous studies, it seems that the trade-off between a moderate and an intensive form of disease management is most apparent in patients at low or intermediate risk who are not too sick to be treated. Patients at high risk, in contrast, do not seem to benefit from a more intensive form of multidisciplinary disease management. The question of whether such patients should therefore only be offered a basic form of disease management is an ethical discussion that is beyond the scope of this paper.
In our analysis, we applied a previously developed multivariable risk prediction model to combine the information captured within several covariates into a single prognostic index to represent baseline risk. We subsequently used this index to explore heterogeneity in treatment effect across different subgroups of patients. Compared to conventional subgroup analysis based on a single prognostic covariate, integrating multiple independent patient characteristics associated with the outcome parameters of interest in a multivariable risk prediction model improves risk stratification [34, 35]. This, in turn, can greatly enhance the statistical power to detect variations in treatment benefit, as was shown in a previously conducted simulation study [36]. Moreover, the use of such a multivariable approach avoids the problem of multiple testing that results from the need to repeat the subgroup analysis for different individual risk factors. Thus, the chances of obtaining false positive findings are reduced [36, 37].
While treatment-predicted risk interaction can best be assessed on a continuous scale [38], discretization of the predicted risks into two or more ordinal categories becomes essential if we want to use the underlying risk prediction model to guide the selection of therapy. By deriving the cutoff of 0.16 from the treatment effect pattern observed in a STEPP, we were still able to make effective use of the discriminative power of a continuous prognostic index in our search for an efficient reimbursement policy. This does not hold when applying conventional subgroup analysis based on a single prognostic covariate, as we did as part of our previous economic evaluation in this patient population [22]. When the net benefit gains of the two stratification bases were compared, the subgroup strategy proposed in this study was found to outperform the previous one with an average gain in NMB of €1174 (95% CI −€1146 to €3284).
A first limitation of this study is that the cutoff of 0.16 may be specific to the data analyzed in this paper. It was selected by taking into account the pattern of treatment–risk interaction in a single clinical trial. Future research is thus required to determine to what extent this cutoff can also serve as a suitable stratification basis for other studies. Secondly, rather than using an external model (i.e., a risk prediction model developed on another dataset), we used an internally developed risk prediction model to assess the treatment effect across different subpopulations of predicted risk. The validity of this approach was recently assessed by Burke et al. [34], who concluded that “appropriately developed internal models produce relatively unbiased estimates of treatment effect across the spectrum of risk”. In addition, these authors found that “when estimating treatment effect, internally developed risk models using both treatment arms should, in general, be preferred to models developed on the control population”. As all treatment groups of COACH were included in the development of the COACH risk prediction model, this is exactly the strategy that we have followed in the current paper. Thirdly, because we selected the difference in NMB as the measure of treatment benefit, our results are conditional on the value assumed for the willingness-to-pay threshold. Because this is a first paper introducing the application of our proposed approach, we selected only a single threshold. For actual decision-making purposes, we would, however, recommend performing a sensitivity analysis that repeats the approach for different values of the willingness-to-pay threshold to make sure that the risk-stratified treatment recommendation is robust with respect to the selected threshold value. Another limitation of this study is that the time horizon for the economic evaluation was restricted to the 18-month follow-up period of the COACH study, meaning that cost differences and survival benefits are likely to be underestimated. In future applications of our proposed method, one could therefore consider extrapolating the patient-level cost and survival estimates beyond the range of the trial data by applying more advanced statistical modeling techniques, such as the multi-state modeling approach proposed by Cao et al. [39]. Finally, heterogeneity in individual patient preferences was not considered in our analysis, although it has been suggested as an important factor when developing personalized treatment recommendations [5].
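The sensitivity analysis with respect to the willingness-to-pay threshold recommended above can be sketched as follows. This is a minimal illustration under stated assumptions: the function name `delta_nmb_by_threshold`, the threshold grid, and the four-patient data set are hypothetical and chosen only to make the arithmetic easy to follow.

```python
import numpy as np

def delta_nmb_by_threshold(effects, costs, treat, thresholds):
    """Difference in mean NMB (treat == 1 minus treat == 0) per threshold.

    For each willingness-to-pay value lam, NMB_i = lam * effect_i - cost_i,
    so the difference equals lam * (difference in mean effect) minus
    (difference in mean cost).
    """
    effects = np.asarray(effects, dtype=float)
    costs = np.asarray(costs, dtype=float)
    treat = np.asarray(treat)
    d_effect = effects[treat == 1].mean() - effects[treat == 0].mean()
    d_cost = costs[treat == 1].mean() - costs[treat == 0].mean()
    return np.asarray(thresholds, dtype=float) * d_effect - d_cost

# Hypothetical stratum: treatment adds 0.2 QALYs at an extra cost of 3000.
effects = [1.0, 1.0, 0.8, 0.8]
costs = [8000, 8000, 5000, 5000]
treat = [1, 1, 0, 0]
thresholds = [10000, 15000, 20000, 50000]

deltas = delta_nmb_by_threshold(effects, costs, treat, thresholds)
for lam, d in zip(thresholds, deltas):
    verdict = "favors treatment" if d > 0 else "does not favor treatment"
    print(f"threshold {lam:>6}: delta NMB {d:9.1f} ({verdict})")
```

In this toy example the recommendation flips at a threshold of 15,000 per QALY; in an actual application the same loop would be wrapped around the full STEPP analysis so that both the cutoff and the stratified gain are re-derived for each threshold value.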
To conclude, the emerging role of health economics in personalized medicine has recently been recognized and is actively discussed [40–45]. To assess how personalized medicine may maximize net benefits, it is crucial to develop risk-stratified treatment recommendations [46] that are supported by subgroup cost-effectiveness analysis. Recently, value of information analysis was adapted to develop stratified treatment recommendations that maximize net health benefit or NMB [9, 10]. This technique may be useful when a model-based economic evaluation is conducted. Our proposed approach based on STEPP enables the development of stratified treatment recommendations when the economic evaluation is conducted alongside a clinical trial.