Introduction
Multiple myeloma (MM), generally considered incurable, is the second most common haematological malignancy and accounts for approximately 0.8% of all new cancer cases worldwide [1‐3]. The incidence of cancer and the survival of cancer patients, in general and for MM in particular, have increased in the past few decades, and a similar trend has been observed for the economic burden of cancer management [4‐7]. For this reason, and particularly given the budget constraints that many healthcare decision-makers face, the value of cancer drugs is increasingly being scrutinised [7, 8].
Cost-effectiveness studies, along with other health economic studies such as budget impact analyses, are essential tools that allow healthcare managers to make evidence-based decisions regarding the value and affordability of health technologies. Randomised controlled trials (RCTs) are the gold standard for identifying relative treatment effects and are well suited to producing evidence for regulatory approval [6]. However, Sullivan et al. and Neyt et al. argue that results from cost-effectiveness analyses based solely on RCTs may not predict the benefits and costs of new treatments in real-world (RW) patients and that these analyses should be supplemented with information collected from observational databases when available [6, 9]. In fact, several differences between RCTs and the RW may limit the applicability of RCT-only economic models to RW populations. Examples include differences in patient selection criteria (inclusion and exclusion criteria are generally stricter in RCTs than in RW studies), treatment patterns and dosing, use of supportive care and extent of follow-up (patients’ adherence to treatment tends to be better in RCTs than in RW studies), and differences in care across countries, particularly in the context of oncology [6, 8, 10]. Observational databases, by contrast, capture characteristics and outcomes of patients receiving treatment in real life. The Registry of Monoclonal Gammopathies (RMG), for instance, captures a wide range of data on MM patients in the Czech Republic, and comparisons across published studies demonstrate that differences exist between RCTs and the RW; for example, outcomes of patients treated with lenalidomide and dexamethasone (Rd) are considerably poorer in RW patients than in recent RCTs [11‐16]. Additionally, the limited duration of RCTs poses an extra hurdle for the generalisation of economic model results to the RW, as the time horizon of economic models often requires extrapolation of clinical data well beyond the trial duration [17]. In registries and observational databases, patients may be followed for longer periods, and consequently the uncertainty around long-term estimates may be considerably lower than that resulting from extrapolation of trial data [9, 17, 18]. Mullins et al. claim that this RW evidence is critical for coverage decisions by payers and treatment decisions by physicians and patients. For that reason, economic models that combine the strengths of both RCTs (i.e. relative treatment effects) and RW data [i.e. baseline risks such as progression-free survival (PFS) and overall survival (OS) in patients receiving the comparator treatment] may provide more relevant and less uncertain estimates than those based on RCTs only, as long as the evidence available from observational databases is robust and representative of the RW patient population [8, 9, 19, 20]. Therefore, this modelling approach is deemed appropriate to support well-informed decision-making in the RW, as it may minimise the risk of inefficient allocation of resources, including the chances of denying access to more efficacious therapies erroneously considered not cost-effective, as well as the likelihood of inaccurate budget impact predictions [8, 9, 19, 20].
Several studies have reported the RW cost-effectiveness of cancer drugs by combining data from RCTs and observational databases, reinforcing the validity of the approach described above. For instance, Seferina et al. estimated the RW cost-effectiveness of trastuzumab plus chemotherapy versus chemotherapy alone in early breast cancer by combining RW outcomes for the trastuzumab arm with treatment effect estimates [expressed as hazard ratios (HRs) of trastuzumab versus control arm] from the HERA trial [21, 22]. Similarly, van Gils et al. analysed the RW cost-effectiveness of oxaliplatin in colon cancer by combining published efficacy data from the MOSAIC trial with RW data from a Dutch population-based observational study [10]. Other studies have adopted a similar approach to estimate the RW cost-effectiveness of health technologies, including in disease areas other than cancer such as cardiovascular disease or chronic obstructive pulmonary disease [23‐26].
The aim of the present study was to estimate the RW cost-effectiveness of carfilzomib in combination with lenalidomide and dexamethasone (KRd) compared with Rd for the treatment of relapsed MM after one to three prior therapies. For this purpose, observational data for Rd from the RMG in the Czech Republic were combined with treatment effect estimates from the ASPIRE trial, a randomised, open-label, multicentre, phase 3 study that evaluated the safety and efficacy of KRd compared with Rd in relapsed MM patients who had received one to three prior treatments [12, 15, 16].
Discussion
The current analysis evaluated the RW cost-utility of KRd versus Rd in relapsed MM patients who have received one to three prior therapies, resulting in an ICER of €73,156 per QALY gained in the base case. The cost-utility model developed for the analysis used a partitioned survival modelling approach, which is employed in a significant proportion of economic evaluations of cancer therapies. Scientifically reputable health technology assessment (HTA) agencies such as NICE have repeatedly reviewed and confirmed the appropriateness of this model structure [29, 30]. The analysis was conducted from the payer perspective, and the Czech Republic was chosen to illustrate the model given the rich observational data sources available in the country.
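The partitioned survival structure mentioned above can be sketched in a few lines. The following is a minimal illustration, not the authors' model: it assumes three health states (progression-free, progressed, dead) derived directly from PFS and OS curves, omits discounting and adverse events, and uses hypothetical function and parameter names throughout.

```python
def partitioned_survival(pfs, os_, cycle_years, utility_pf, utility_pd, cost_pf, cost_pd):
    """Accumulate undiscounted costs and QALYs from PFS and OS curves.

    pfs and os_ hold survival probabilities per model cycle; costs are
    expressed per patient-year spent in each health state (all hypothetical).
    """
    total_cost = 0.0
    total_qalys = 0.0
    for s_pfs, s_os in zip(pfs, os_):
        progression_free = min(s_pfs, s_os)   # the PFS curve cannot lie above OS
        progressed = s_os - progression_free  # alive but post-progression
        total_cost += cycle_years * (progression_free * cost_pf + progressed * cost_pd)
        total_qalys += cycle_years * (progression_free * utility_pf + progressed * utility_pd)
    return total_cost, total_qalys


def icer(cost_new, qalys_new, cost_ref, qalys_ref):
    """Incremental cost per QALY gained of the new treatment versus the reference."""
    return (cost_new - cost_ref) / (qalys_new - qalys_ref)
```

Running `partitioned_survival` once per treatment arm and passing the two results to `icer` gives the incremental cost per QALY; the proportion of patients in each state is read from the curves rather than from transition probabilities, which is the defining feature of this model type.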
To estimate the RW cost-effectiveness of KRd versus Rd, the baseline hazards of patients treated with Rd (PFS, OS and TTD) were calculated from the RMG, one of the most comprehensive and relevant registries capturing outcomes of MM patients [16]. The KRd versus Rd HRs from ASPIRE were applied to the baseline hazards to estimate the hazards of patients receiving KRd in the RW, assuming that the relative treatment effects observed in ASPIRE are applicable in the RW. Results from the phase 3 ASPIRE trial demonstrated that the relative treatment effects are consistent across a wide variety of subgroups of relapsed MM patients, and additional statistical analyses showed no significant treatment-covariate interaction in the ASPIRE patient population [15, 32]. This is regarded as a strong evidence base to support the applicability of trial HRs in the RW [9]. This methodology has previously been adopted to estimate the RW cost-effectiveness of health technologies in oncology as well as in other disease areas, such as cardiovascular and respiratory diseases [10, 21, 22, 24‐26]. The approach has also been accepted by NICE, which issued a positive recommendation for evolocumab for treating primary hypercholesterolaemia or mixed dyslipidaemia in specific patient groups based on an economic model that combined baseline risks of cardiovascular disease from the Clinical Practice Research Datalink registry with reductions in cardiovascular events from a meta-analysis of RCTs [58].
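Applying a trial HR to a registry baseline can be illustrated under a proportional hazards assumption, where the treated survival curve equals the baseline survival curve raised to the power of the HR. The sketch below is illustrative only: the exponential baseline is a simplifying assumption (the parametric forms fitted to the RMG data are not described in this section), the HR of 0.75 is a made-up value rather than an ASPIRE estimate, and the function names are hypothetical. The 19.3-month median is the approximate RMG median OS quoted in this Discussion.

```python
import math


def exponential_survival(t_months, median_months):
    """Baseline survival at time t for an exponential curve with the given median."""
    rate = math.log(2) / median_months
    return math.exp(-rate * t_months)


def apply_hazard_ratio(baseline_survival, hazard_ratio):
    """Treated-arm survival under proportional hazards: S_new(t) = S_base(t) ** HR."""
    return baseline_survival ** hazard_ratio


# Hypothetical inputs for illustration only.
BASELINE_MEDIAN_OS = 19.3  # months, approximate RMG value quoted in the text
ILLUSTRATIVE_HR = 0.75     # NOT an ASPIRE estimate

s_rd = exponential_survival(BASELINE_MEDIAN_OS, BASELINE_MEDIAN_OS)  # 0.5 by construction
s_krd = apply_hazard_ratio(s_rd, ILLUSTRATIVE_HR)
```

With an exponential baseline, an HR below 1 stretches the median by a factor of 1/HR; with other baseline shapes the same power transformation applies to the curve, but the effect on the median differs.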
Neyt et al. argue that combining observational data with evidence from RCTs is a solution for handling potential differences between RW patients and RCT patients. RCTs are the gold standard for estimating relative treatment effects, whereas observational databases capture baseline risks of patients treated in RW conditions; an analysis that combines the strengths of both may therefore yield results that are more relevant for policy purposes than results obtained only from data collected under ideal circumstances (i.e. RCTs). With regard to the current decision problem, the outcomes observed in ASPIRE were substantially better than those observed in the RMG. In ASPIRE, the median PFS and OS were 17.6 and 40.4 months, respectively, for patients receiving Rd [15, 38]; patients in the RMG, however, had median PFS and OS values of approximately 7.6 and 19.3 months, respectively (weighted values from Table 1). Similar differences were identified for treatment duration: the median TTD was 13.1 months in the Rd arm in ASPIRE, in contrast with 6.1 months in the RMG (Table 1) [15, 38]. These dissimilarities between ASPIRE and the RMG are likely to arise from differences in patient characteristics, treatment selection and treatment patterns between the trial and the RW. For these reasons, and given the available evidence base, the use of registry data to inform baseline risks in economic models is considered to present healthcare managers with the most relevant information for appropriate decision-making and to avoid unrealistic budget impact predictions caused by overestimating key variables such as treatment duration. This is particularly important in MM, where a number of trials that enrolled patients across the world have consistently shown better outcomes and longer treatment duration than are achieved in the RW [11‐15].
The sensitivity analyses showed that the model is particularly sensitive to the parameters predicting, and the assumptions made around, the relative treatment effect on OS of KRd versus Rd. However, considering that RW outcomes are not yet available for KRd, the base case is considered to represent a set of plausible assumptions.
In the current model, patients in the KRd arm were estimated to spend a longer time in PFS than patients in the Rd arm, which in turn extended the use of lenalidomide and dexamethasone in the KRd arm (the cost of lenalidomide and dexamethasone was €41,273 versus €36,069 in the KRd and Rd arms, respectively; Table 6). Innovative therapies like carfilzomib tend to extend the use of costly therapies that have been considered cost-effective in the past (e.g. lenalidomide given in combination with carfilzomib in the KRd regimen), and this could generate the perception that the innovative therapies are more expensive than they actually are [32, 56, 57]. The currently accepted methodology for cost-effectiveness analysis does not account for the new paradigm of oncology regimens administered in combination, which represents a major hurdle to demonstrating the cost-effectiveness of innovative therapies. HTA agencies such as NICE have recognised these challenges and acknowledged that some innovative therapies may not be cost-effective even at zero price, but no practical solution has been proposed and widely accepted thus far [59]. For these reasons, one scenario analysis evaluated the cost-effectiveness of carfilzomib excluding the costs of lenalidomide and dexamethasone in both the KRd and Rd arms, i.e. focusing the analysis on the introduction of carfilzomib only. The ICER was lower than that of the base case (€67,347 versus €73,156 per QALY), which is in line with the results shown by Jakubowiak et al. [32]. This approach was accepted by NICE in the technology appraisal of cinacalcet, where the costs of dialysis were excluded from the base case analysis [60, 61].
In RCTs, the randomisation process is expected to produce treatment groups that are balanced across covariate levels. In practice, however, it is common to observe post hoc imbalances in covariates across treatment groups, which may have a confounding effect. To remove the between-patient variability associated with covariates not included as randomisation factors, increase the generalisability of the analyses and allow for unbiased transferability to RW data, the PFS and OS HRs estimated from ASPIRE were adjusted for a number of baseline covariates [32]. A scenario analysis quantified the impact of covariate adjustment on the cost-effectiveness results by implementing the unadjusted HRs from ASPIRE, and the ICER increased from €73,156 to €93,094 per QALY [15]. Nevertheless, the stepwise Cox models fitted to the ASPIRE patient-level data indicated that a number of covariates may have a prognostic effect on PFS and OS, and therefore the base case ICER is considered to be more precise and relevant for decision-making purposes.
Additional scenario analyses demonstrated the robustness of the model results. The assumption of equal utilities in the KRd and Rd arms, which represents a conservative assumption as described by Jakubowiak et al., only increased the ICER to €77,258 per QALY, and a similar effect on the ICER was observed when shortening the time horizon to 20 years (€80,703 per QALY) or setting the discount rate of both costs and outcomes at 5% (ICER of €83,807 per QALY). On the other hand, assuming a discount rate of 0% improved the cost-effectiveness of KRd considerably, yielding an ICER of €56,930 per QALY.
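The direction of the discount rate scenarios can be reproduced with a toy example. The sketch below assumes, purely for illustration, that incremental costs are concentrated in the early years while incremental QALYs accrue over a longer period; the yearly streams are invented and are not outputs of the model described here, and the function names are hypothetical.

```python
def present_value(yearly_values, rate):
    """Discount a stream of yearly values (year 0 first) at a constant annual rate."""
    return sum(value / (1.0 + rate) ** year for year, value in enumerate(yearly_values))


def discounted_icer(incremental_costs, incremental_qalys, rate):
    """ICER with costs and outcomes discounted at the same annual rate."""
    return present_value(incremental_costs, rate) / present_value(incremental_qalys, rate)


# Invented yearly incremental streams, for illustration only.
INC_COSTS = [30000.0, 20000.0, 10000.0, 0.0, 0.0, 0.0]  # front-loaded costs
INC_QALYS = [0.05, 0.10, 0.15, 0.15, 0.15, 0.10]        # benefits accrue later

icer_0 = discounted_icer(INC_COSTS, INC_QALYS, 0.00)
icer_5 = discounted_icer(INC_COSTS, INC_QALYS, 0.05)
```

Under these assumptions the ICER rises with the discount rate because later QALYs are shrunk more than earlier costs, which is the same direction as the 0% and 5% scenarios reported above. Whether this timing pattern fully explains the reported figures is not something this sketch can establish.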
The analysis had various limitations associated with the underlying data and methods. Firstly, the review of the literature to identify some input parameters for the cost-effectiveness model was not systematic. All inputs were, however, obtained from relevant data sources (either the pivotal clinical trial ASPIRE or local data sources in the Czech Republic), and therefore the impact of not having conducted a systematic literature review for all input parameters is considered minimal. This strategy is aligned with other RW cost-effectiveness studies in the literature [10, 21, 22, 62]. Secondly, the PFS, OS and TTD curves were derived from data collected during a period in which, in the Czech Republic, patients were treated with lenalidomide only up to a maximum cumulative dose of 4200 mg [56]. The model, however, assumed that patients would be treated with lenalidomide until progression, in line with the most recent decision in October 2016 by SÚKL on lenalidomide reimbursement, and costs of lenalidomide and dexamethasone were implemented accordingly [57]. The outcomes that would have been observed if lenalidomide and dexamethasone had been given until progression may have been better than those captured in the RMG and used in the current model, and therefore the outcomes generated in the current model may be an underestimate. On the other hand, no hard stop at eight cycles (i.e. equivalent to a cumulative dose of 4200 mg assuming no dose reductions and no missed doses) or at any time point afterwards was observed in the TTD curves from the RMG, indicating that the impact of the 4200 mg cap may not be sizeable. Thirdly, with regard to AE rates, the model included rates estimated from the ASPIRE frequencies of AEs. No data on AEs were available from the RMG and therefore no further adjustment was conducted. This represents a further limitation, although the impact of AE costs on the cost-effectiveness of KRd is minimal (i.e. the incremental cost of AEs is only 0.07% of the total incremental costs of KRd compared with Rd; see Table 6). The last PFS and OS events in patients captured in the RMG occurred at nearly 5 years; the KM estimates showed a probability of remaining progression-free of approximately 5% and a probability of survival of approximately 20% at about 5 years (see online resources; Supplementary Figures 1 and 2). The long-term extrapolation of PFS and OS may be seen as a key contributor to model uncertainty, particularly considering the length of the time horizon in the base case, but, taking into account the maturity of the RMG data, this long-term extrapolation is not deemed to have a large impact on the results. Moreover, in a recent retrospective analysis of long-term PFS and OS data of Rd patients in the RMG, the median PFS and OS were estimated to be 9.0 months and 18.5 months, respectively [62]. PFS and OS at 6 years were < 5% and 20%, respectively. These values are closely in line with the predictions of our model, and we therefore believe the PFS and OS predictions can be considered valid. Additionally, a scenario analysis examined the impact of shortening the time horizon to 20 years and demonstrated that the choice of time horizon does not have a large impact on the cost-effectiveness results. Other limitations, such as the uncertainty around the utility estimates, have been discussed by Jakubowiak et al. [32].
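The equivalence between eight cycles and the 4200 mg cumulative lenalidomide cap noted in the limitations above can be checked with simple arithmetic. The sketch assumes the standard Rd schedule of 25 mg lenalidomide on days 1 to 21 of each 28-day cycle, which is not stated in this section; the reduced dose of 15 mg is a hypothetical example, and the function name is illustrative.

```python
def cycles_until_cap(cap_mg, daily_dose_mg, dosing_days_per_cycle):
    """Number of cycles needed to reach a cumulative dose cap at a constant dose."""
    return cap_mg / (daily_dose_mg * dosing_days_per_cycle)


CAP_MG = 4200            # cumulative cap quoted in the text
DOSING_DAYS = 21         # assumed: days 1 to 21 of a 28-day cycle
CYCLE_LENGTH_DAYS = 28   # assumed cycle length

full_dose_cycles = cycles_until_cap(CAP_MG, 25, DOSING_DAYS)     # 25 mg/day, assumed full dose
reduced_dose_cycles = cycles_until_cap(CAP_MG, 15, DOSING_DAYS)  # hypothetical reduced dose
full_dose_days = full_dose_cycles * CYCLE_LENGTH_DAYS            # calendar days on treatment
```

At full dose the cap is reached after exactly eight cycles (224 days under the assumed cycle length), whereas any dose reduction or missed dose pushes the cap later. This may be one reason why no hard stop at eight cycles would be visible in a TTD curve, although the source does not attribute the finding to this mechanism.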
The cost-effectiveness analysis by Jakubowiak et al. compared KRd versus Rd in relapsed MM from a US perspective and reported an ICER of $107,520 per QALY [32]. The authors estimated that patients treated with KRd would gain 1.99 incremental LYs and 1.67 incremental QALYs compared with Rd, in contrast with the 0.99 incremental LYs and 0.88 incremental QALYs estimated in the current model [32]. Larger differences can be observed when absolute LY and QALY estimates are compared, despite the similar relative improvement in LYs and QALYs between the two analyses [32]. This reinforces the value of using RW data in cost-effectiveness analyses to avoid estimates that diverge from observed outcomes in the RW. These seemingly disparate results can be explained primarily by one key difference in the modelling approach between the two models: the data source used for calculating the PFS, OS and TTD curves. Jakubowiak et al. derived these curves for both KRd and Rd arms by fitting joint parametric models to the ASPIRE trial data; registry data (collected from the US Surveillance, Epidemiology, and End Results registry) were only used for the extrapolation of the Rd OS curve after the time of the last death event in the Rd arm in ASPIRE, and the OS HR was then used to estimate the corresponding OS curve for patients in the KRd arm.
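Because an ICER is the incremental cost divided by the incremental QALYs, the figures quoted above imply an approximate incremental cost for each analysis. The sketch below back-calculates those values from the rounded numbers in the text; the results are indicative only, are not reported in either study, and are in different currencies and price years, so they should not be compared directly. The function name is hypothetical.

```python
def implied_incremental_cost(icer_per_qaly, incremental_qalys):
    """Back-calculate incremental cost from an ICER and incremental QALYs."""
    return icer_per_qaly * incremental_qalys


# Rounded figures quoted in the text.
current_cost_eur = implied_incremental_cost(73156, 0.88)  # current RW model, euros
us_cost_usd = implied_incremental_cost(107520, 1.67)      # Jakubowiak et al., US dollars

# QALYs gained per incremental life-year in each analysis.
current_qaly_per_ly = 0.88 / 0.99
us_qaly_per_ly = 1.67 / 1.99
```

The QALY-per-LY ratios are of similar magnitude in the two analyses, while the absolute gains differ by roughly a factor of two, consistent with the observation that the analyses diverge mainly in the survival curves rather than in how life-years are converted to QALYs.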
In summary, this analysis showed that cost-effectiveness models of health technologies in the RW can generate policy-relevant results when the strengths of both RCTs and powerful observational databases are combined. The current model showed that KRd is likely to be cost-effective versus Rd in the RW population (MM patients with one to three prior therapies), with an ICER of €73,156 per QALY. These results, along with the cost-effectiveness analysis conducted by Jakubowiak et al., confirm that KRd is likely to be cost-effective versus Rd in both the clinical trial and RW settings [32]. Therefore, the reimbursement of KRd for this patient population represents an efficient allocation of resources within the healthcare system.