Background
Network meta-analyses (NMA) are increasingly being performed to inform decision-making regarding the comparative efficacy and safety of alternative treatments [1]. In order to determine the comparative efficacy or safety of a new treatment using an NMA it is necessary to establish the relevant comparators. Generally, the indication for the new treatment and the way in which the new treatment is expected to be used in clinical practice will determine the comparators of interest. In some cases the comparators are explicitly defined by reimbursement agencies for a technology appraisal, which is the case in the United Kingdom where the National Institute for Health and Care Excellence (NICE) develops a final scope based on a stakeholder consultation process [2].
In order to inform decision-making it is necessary to assess whether it is feasible to perform a valid NMA to compare the new treatment with usual care based on the available randomized controlled trials (RCTs). As with any NMA, the validity of such an analysis relies on whether there are systematic differences among the studies included in the network across treatment comparisons, especially patient or disease characteristics that are treatment effect modifiers [3‐6]. Although there is guidance available regarding the underlying principles of an NMA, there is a need for a more structured process that incorporates both clinical and methodological expertise to assess the feasibility of performing a valid NMA [7]. The aim of this study is to outline a general process for assessing the feasibility of performing a valid NMA. A case study is used to illustrate the feasibility of performing an NMA to compare everolimus in combination with hormonal therapy to alternative chemotherapies in terms of progression-free survival (PFS) for women with advanced breast cancer (ABC).
The first section presents general steps for assessing the feasibility of an NMA. Next, the case study is presented in terms of the background, the identification and selection of trials, the method for the systematic review and analysis, and the results of the feasibility assessment and NMA. Readers are encouraged to treat the clinical example as a case study in applying the process to a research question, and to consider how the process might be applied to their own reviews.
Discussion
The aim of this study was to propose a more structured process to assess the feasibility of performing a valid NMA. The suggested procedure builds on existing recommendations for NMAs and provides more explicit guidance regarding the questions that should be answered at each step. The process is designed to be stepwise, with the initial stages focused on the clinical differences (that is, related to treatments, outcomes, study design and patients) and the later stages focused on evaluating the observed outcomes. Parts A and B involve an assessment of the clinical heterogeneity in terms of treatment, outcome, study and patient characteristics. Parts C and D involve an evaluation of the differences within and across the direct pairwise comparisons in terms of baseline risk and observed treatment effects. This means that it may be decided that an NMA is not feasible after the initial stage, without having assessed heterogeneity or inconsistency. If the decision is made to complete the full feasibility assessment, the available data should be illustrated and the underlying assumptions should be clearly stated, thereby improving the transparency and facilitating an interaction between methodologists and clinicians. While this process does not avoid the need for subjective decisions, it allows decision-makers or researchers to critically analyze each choice as well as to update an analysis using a different approach without necessarily having to repeat the entire process.
The final step of any NMA is to critically assess the findings. Recently, guidance to facilitate this process has been developed, including the International Society for Pharmacoeconomics and Outcomes Research ‘instrument to assess the relevance and credibility of a NMA’ [70], a ‘reviewer’s checklist’ for evidence synthesis for treatment efficacy used in decision-making [71], as well as guidance on ‘how to use an article reporting a multiple treatment comparison meta-analysis’ (Grading of Recommendations Assessment, Development and Evaluation (GRADE)) [72]. Additionally, the GRADE process to assess meta-analyses has recently been updated by Cochrane to address NMA more specifically. The current process for assessing the feasibility does not provide explicit guidance regarding the types of tools to be used for this process, but there seems to be a shared focus on some key principles that should be assessed, including the magnitude of the treatment effects, the uncertainty in the estimates and the risk of bias due to the quality of the RCTs as well as any differences in the distribution of treatment effect modifiers across direct treatment comparisons.
In the case study comparing everolimus to alternative chemotherapies in terms of PFS for women with ABC the feasibility of the NMA was determined to be limited. Although it was possible to achieve a connected network of RCTs for the comparisons of interest, differences were identified in terms of the treatment doses and the outcome definitions, which could be explored by excluding outlier studies. However, differences were also identified with respect to the pre-defined treatment effect modifiers as well as post-hoc differences in specific patient characteristics that were not possible to explore based on the available data. Some variation in baseline risk within trials including TAM was observed, as was some heterogeneity in the treatment effects, whereas the inconsistency was challenging to assess in this network. In conclusion, given the differences identified in potential treatment effect modifiers which cannot be explored, there is a substantial risk that differences in these potential treatment effect modifiers may introduce bias, threatening the overall validity of the NMA, which reflects a limitation of the available data. Despite the limited feasibility of the case study, it was decided to perform the NMA for exploratory purposes. The point estimates from the analysis suggest that everolimus in combination with EXE or TAM is at least as efficacious as the chemotherapies of interest in terms of PFS. However, the comparison of interest is linked through several indirect treatment comparisons, which led to substantial uncertainty in the treatment estimates. We would advise caution regarding the interpretation of the results given the conclusion of the feasibility assessment.
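The first requirement noted above, a connected network of RCTs for the comparisons of interest, can be checked mechanically before any clinical assessment. The following is a minimal sketch only, assuming each trial is described by the set of treatments it randomizes; the function name `is_connected` and the treatment labels are hypothetical placeholders and do not correspond to the trials in the case study.

```python
from collections import defaultdict, deque


def is_connected(trials, treatments_of_interest):
    """Return True if all treatments of interest are linked by a chain of trials.

    `trials` is an iterable of tuples, each listing the arms of one trial.
    Two treatments are directly linked when they appear in the same trial.
    """
    # Build an undirected graph: nodes are treatments, edges are direct comparisons.
    graph = defaultdict(set)
    for arms in trials:
        for a in arms:
            for b in arms:
                if a != b:
                    graph[a].add(b)

    targets = list(treatments_of_interest)
    if not targets:
        return True

    # Breadth-first search from the first treatment of interest.
    seen = {targets[0]}
    queue = deque([targets[0]])
    while queue:
        node = queue.popleft()
        for neighbour in graph[node]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append(neighbour)

    return all(t in seen for t in targets)


# Hypothetical evidence base: labels are placeholders, not real trials.
hypothetical_trials = [("A", "B"), ("B", "C"), ("D", "E")]
connected_a_c = is_connected(hypothetical_trials, ["A", "C"])  # linked through B
connected_a_d = is_connected(hypothetical_trials, ["A", "D"])  # no linking trial
```

A connected network is necessary but not sufficient: as the case study shows, the network may be connected while differences in treatment effect modifiers still limit the feasibility of a valid NMA.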
The decision to proceed with the NMA can be criticized in light of the findings from the feasibility assessment. However, there is an immediate need for evidence from decision-makers given the context of the research question, as well as a potential long-term gap in the evidence, which suggests this NMA may provide the best available evidence. For example, findings from the current NMA may provide a more robust result based on the available evidence in comparison to a previous ‘naïve chained indirect analysis’ that multiplied a pooled hazard ratio for chemotherapy versus endocrine therapy (from the meta-analysis by Wilken et al.) by a hazard ratio for everolimus in combination with TAM versus TAM (based on the TAMRAD trial and assumed to be the same as that for everolimus in combination with EXE versus EXE) [73]. Although there is a risk that results of the NMA will be over-interpreted, we would argue that the purpose of the feasibility assessment is to ensure that the underlying assumptions and limitations of the NMA are clearly communicated. Further, NMA results may help to quantify the between-study variability (and possibly the inconsistencies in the evidence base in some cases), thereby providing a more complete exploration of heterogeneity, which may generate further hypotheses [7]. Finally, in some cases, results of an NMA may actually help to trigger a response from clinical experts regarding the plausibility of the underlying assumptions, which may otherwise be more difficult to reveal. In general, we would advise consideration regarding the value of exploratory analyses against the risk of over-interpretation.
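To make the idea of chaining hazard ratios concrete, the sketch below combines two hazard ratios that share a common comparator on the log scale and carries the uncertainty of both through to the result. It is an illustration under stated assumptions, not the calculation used in the cited analysis: the function name `chain_hazard_ratios`, the generic treatment labels A, B and C, and all input values are hypothetical, and the two estimates are assumed to be independent and oriented so that their product is the comparison of interest.

```python
import math


def chain_hazard_ratios(hr_ab, se_log_ab, hr_bc, se_log_bc, z=1.96):
    """Combine HR(A vs B) and HR(B vs C) into an indirect HR(A vs C).

    Point estimates multiply (log hazard ratios add); assuming independent
    estimates, the variances of the log hazard ratios also add.
    """
    log_hr = math.log(hr_ab) + math.log(hr_bc)
    se_log = math.sqrt(se_log_ab ** 2 + se_log_bc ** 2)
    return {
        "hr": math.exp(log_hr),
        "se_log": se_log,
        "ci_low": math.exp(log_hr - z * se_log),
        "ci_high": math.exp(log_hr + z * se_log),
    }


# Hypothetical inputs chosen only to illustrate the arithmetic.
result = chain_hazard_ratios(hr_ab=0.5, se_log_ab=0.3, hr_bc=0.8, se_log_bc=0.4)
```

Because the variances add, the indirect estimate is always less precise than either input, which is one reason a comparison linked through several indirect steps carries substantial uncertainty.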
The case study of everolimus for women with ABC provides a unique opportunity to illustrate the challenges associated with evaluating the feasibility of an NMA given that this new treatment reflects a step-change in clinical practice. In such cases where a new treatment introduces an additional step in the traditional treatment pathway, it may be necessary to compare the current treatment pathway (in the absence of the new treatment) with the anticipated treatment pathway (including the new treatment). When there are no trials available comparing the current treatment pathway to the anticipated treatment pathway, this often implies a comparison between the new treatment and the usual treatment used as the ‘next step’ in the treatment pathway. However, by definition, a new treatment that delays the next step in a treatment pathway is designed to target a less severe population. Consequently, there is an inherent risk that the patient characteristics of the RCTs available for the new treatment are not comparable to those patients in the RCTs evaluating the ‘next step’. Additionally, as new treatments become more targeted based on genetic differences in receptors, it may be difficult to compare new trials evaluating a subset of patients with older trials including a full population (that may not report the receptors of interest). Despite these limitations, it may be decided to combine the direct and indirect results and to perform an NMA given the absence of evidence regarding the comparison of interest and the need for clinicians and health technology bodies such as NICE to make decisions. The tendency to perform an NMA in this context reinforces the importance of the feasibility assessment process.
Moreover, this case study identifies a clear need for a new trial comparing everolimus to chemotherapy, or a comparison of the alternative treatment pathways with and without everolimus (that is, everolimus followed by chemotherapy versus placebo followed by chemotherapy).
One of the main limitations of the case study is that overall survival was not assessed. The current study focused on PFS given the available data for everolimus at the time of the feasibility assessment. In comparison to overall survival, PFS is not susceptible to confounding by differences in subsequent treatments across the studies, although there is a risk of assessment bias with PFS. Therefore, overall survival, as well as the safety and adverse events of these agents, should be considered in addition to the results of the current NMA. Another limitation is that the current case study was based on a research question focused on the comparison of everolimus versus chemotherapy. However, the original scope of the research question as defined by NICE also included fulvestrant as a comparator of interest. A separate NMA has been performed by Bachelot et al. in order to address this comparison of interest among women with ER+ ABC following progression or recurrence after endocrine therapy [74]. Although ideally all of the comparisons of interest should be included in one simultaneous analysis, there is a clear justification for a separate analysis in this case given the challenge of comparing everolimus to chemotherapy.
It should be noted that this feasibility process has some limitations. In the initial stages (parts A and B) it may not be necessary to extract the outcomes of interest from all studies, thereby improving the efficiency of the process. However, it is necessary to assess whether there is a sufficient amount of information reported regarding the outcome and its measure of uncertainty, which requires decision rules regarding the calculation of treatment differences or the estimates of uncertainty that may be particularly challenging to define a priori for continuous endpoints depending on the available information. Similarly, if imputation will be used to assess uncertainty measures, a threshold regarding the amount of missing information that will be permitted may be necessary. However, pre-specifying decision rules for all possible types of endpoints, including optimal thresholds for the amount of data required for covariate analyses, may be challenging. Although some research has evaluated alternative imputation methods for NMAs [75], to our knowledge alternative thresholds for missing data depending on the type of outcome require further research.
Although the current case study was based on a complex network structure, in ‘star’ shaped networks, involving several trials with a common comparator (such as placebo), we would emphasize the importance of assessing whether differences in baseline risk exist and can be adjusted (part C). A plot of the difference measure versus the baseline risk is useful to help illustrate the variation in the baseline risk, as well as the relationship between the difference and baseline risk for each treatment. Even in cases where head-to-head trials are included in the network, it is possible to predict a placebo arm on the basis of the other trials [15].
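As an illustration of the quantities behind such a plot, the sketch below derives, for each trial in a star-shaped network with a binary outcome, the baseline risk in the common comparator arm (as log-odds) and the treatment difference (as a log odds ratio). This is a minimal sketch under assumptions: the function name `baseline_vs_effect`, the field names and the event counts are hypothetical, the outcome is assumed to be binary, and no arm is assumed to have zero events or zero non-events.

```python
import math


def baseline_vs_effect(trials):
    """Return one (treatment, baseline log-odds, log odds ratio) tuple per trial.

    The baseline is the log-odds of the event in the common comparator arm;
    the effect is the log odds ratio of the active arm versus that comparator.
    """
    points = []
    for trial in trials:
        odds_ctrl = trial["events_ctrl"] / (trial["n_ctrl"] - trial["events_ctrl"])
        odds_trt = trial["events_trt"] / (trial["n_trt"] - trial["events_trt"])
        points.append(
            (trial["treatment"], math.log(odds_ctrl), math.log(odds_trt / odds_ctrl))
        )
    return points


# Hypothetical counts, for illustration only.
hypothetical_trials = [
    {"treatment": "X", "events_ctrl": 20, "n_ctrl": 100, "events_trt": 10, "n_trt": 100},
    {"treatment": "Y", "events_ctrl": 50, "n_ctrl": 100, "events_trt": 50, "n_trt": 100},
]
points = baseline_vs_effect(hypothetical_trials)
```

Plotting the second element of each tuple on the horizontal axis against the third on the vertical axis, grouped by treatment, gives the display described above; a visible trend would suggest that baseline risk may need to be adjusted for.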
The current framework suggests a separate process for each outcome (and time point) of interest based on evidence available from RCTs identified from a systematic review regarding a comparative efficacy or safety question. However, undergoing the outlined feasibility process is expected to be very time consuming, and it may be more realistic to assess multiple outcomes in parallel, particularly when they are related to the same endpoints or underlying concepts. The case study explores the feasibility of an NMA based on a synthesis of Kaplan-Meier curves; however, this process can be applied to any type of endpoint (that is, binary, continuous or rate outcomes). For binary endpoints it may be important to consider whether differences in follow-up are expected to act as a treatment effect modifier and, if so, to what extent different follow-up (or time points) can be combined. Similarly, for continuous outcomes, the range of time points at follow-up that can be considered comparable should be clearly addressed. It may also be important to consider models that combine multiple time points (repeated measures) or outcomes identified within the systematic review, particularly in cases where the initial feasibility assessment suggests an NMA may not be feasible. For example, in the context of ABC, a multi-state model that accounts for PFS and overall survival (as well as the relationship between the outcomes) may provide more information (and possibly more precision).
Another possible extension of the current process would be to consider a broader evidence base if a network is deemed not to be feasible. Depending on how the research question was defined, it may be important to assess whether additional indirect evidence may be available by broadening the comparators of interest, although this consideration should be offset by the risk of introducing different populations in terms of the distribution of treatment effect modifiers. Similarly, it may be possible to integrate non-randomized evidence using more informative prior distributions [4] or individual patient data from RCTs [5,76‐78] or non-randomized studies [79‐81], which may influence the feasibility of an analysis. Furthermore, it may be possible to elicit bias distributions when there is insufficient data for a meta-regression where experts provide information regarding internal and external biases in order to adjust the study-specific treatment effect [82] as cited in [20]. However, these methods to combine multiple time points, outcomes and study designs are evolving currently and require further research. Therefore, the current process may provide a useful starting point to identify the need for a more complicated approach.
Competing interests
SC and BS are employees of Mapi and received funding from Novartis for the study. JJ is a former employee of Mapi and received funding from Novartis for the study. JZ and SS are full-time employees of Novartis Pharmaceutical Corp and have shares in the company. PS received no compensation for working on this manuscript. PS has no competing interests.
Authors’ contributions
All authors participated in the development of this manuscript. SC, JJ and BS participated in all stages of the study design, systematic literature review, statistical analyses and manuscript development. JZ, SS, and PS participated in the study design and coordination and helped to draft the manuscript. All authors read and approved the final manuscript.