Introduction

Osteoporosis is among the most consequential health crises for industrialized countries with aging populations. In 2000, osteoporosis was the cause of more than 8.9 million fractures [1]; of these fragility fractures, 1.6 million involved the hip; 1.7 million, the forearm; and 1.4 million, the vertebrae [1]. In the absence of meaningful preventive measures, approximately one half of Americans aged > 50 years will be at high risk of osteoporosis by 2020 [2]. Direct costs from osteoporotic fractures in men and women are projected to exceed $25 billion by 2025 in the USA, and the cumulative cost of incident fractures is expected to reach $228 billion for the 10-year period between 2016 and 2025 [3]. In 2010, the estimated direct costs of osteoporotic fractures in the five largest EU countries (France, Germany, Italy, Spain, and UK) amounted to €29 billion, and €38.7 billion in the 27 EU countries [4].

In view of these realities, it is imperative to identify therapeutic agents that can effectively prevent fragility fractures due to osteoporosis, especially in postmenopausal women. The roster of established treatments includes antiresorptive agents, anabolic agents, and agents that have both anabolic and antiresorptive properties. Abaloparatide is an anabolic agent that selectively activates the parathyroid hormone 1 (PTH1)–receptor signaling pathway and stimulates bone formation [5]. Abaloparatide is a treatment for postmenopausal women with osteoporosis who are at high risk for fracture or who have failed, or are intolerant to, other available osteoporosis therapies [6]. In the 18-month Abaloparatide Comparator Trial In Vertebral Endpoints (ACTIVE; NCT01343004), women with postmenopausal osteoporosis (PMO) were randomized to receive daily subcutaneous injection of abaloparatide (80 μg; n = 824), placebo (n = 821), or open-label teriparatide (20 μg; n = 818). Compared with placebo, abaloparatide significantly reduced the risk for new vertebral fractures (VF) and for nonvertebral fractures (NVF): risk reduction for VF was 86% (p < 0.001), and risk reduction for NVF was 43% (p = 0.049) [7]. A prespecified exploratory analysis revealed that abaloparatide was also associated with a significantly reduced risk for major osteoporotic fractures versus placebo (70%; p < 0.001) and teriparatide (55%; p = 0.03) [7]. Risk reduction persisted in these anatomic sites in the extension study of ACTIVE (ACTIVExtend; NCT01657162) after patients on abaloparatide or placebo were switched to alendronate for 24 months [8].

Given the array of agents available for preventing osteoporotic fractures, analysis of their relative safety and efficacy can benefit clinicians looking to individualize treatment for patients. This prompted us to conduct a network meta-analysis (NMA) to identify which osteoporosis treatments exhibit optimal efficacy in postmenopausal women at high risk for fragility fracture. Network meta-analyses employ systematic literature reviews (SLRs) to compare multiple treatments directly within randomized controlled trials (RCTs) and indirectly across trials using a common comparator. Regulatory authorities require that new osteoporosis treatments demonstrate their effect on VFs and NVFs, so we focused our analysis on these events [9].
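The idea of comparing two treatments indirectly through a common comparator can be illustrated with a simple adjusted indirect comparison on the log relative-risk scale. The sketch below is illustrative only and is not the Bayesian model used in this analysis; the function name and all input values are hypothetical.

```python
import math

def indirect_rr(rr_ac, se_log_ac, rr_bc, se_log_bc):
    """Adjusted indirect comparison of A versus B through a common comparator C.

    Inputs are relative risks versus C and the standard errors of their logs.
    Returns (RR of A vs B, lower 95% bound, upper 95% bound).
    """
    # On the log scale the indirect effect is a difference of the two direct effects.
    log_rr_ab = math.log(rr_ac) - math.log(rr_bc)
    # Variances add because the two sets of trials are assumed independent.
    se_ab = math.sqrt(se_log_ac ** 2 + se_log_bc ** 2)
    z = 1.96
    return (
        math.exp(log_rr_ab),
        math.exp(log_rr_ab - z * se_ab),
        math.exp(log_rr_ab + z * se_ab),
    )

# Hypothetical inputs, not taken from any trial in this review.
rr_ab, lower, upper = indirect_rr(0.50, 0.20, 0.80, 0.30)
print(f"Indirect RR A vs B: {rr_ab:.3f} (95% CI {lower:.3f} to {upper:.3f})")
```

Because the two variances add, an indirect estimate is always less precise than either of the direct estimates it is built from.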

Methods

We undertook an SLR to identify all relevant RCTs involving abaloparatide and all pertinent comparators. The main clinical endpoint was the relative risk (RR) of fracture with abaloparatide versus placebo and versus other available treatments. Treatment ranking according to performance for each outcome was the secondary endpoint. The global ranking matrix was populated by the proportion of simulations in which each treatment was ranked best (i.e., the treatment associated with the smallest risk of fractures), second best, third, and so on.
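As a minimal sketch of how such a ranking matrix can be populated, the following code counts, for each simulation, the rank of every treatment by its simulated RR (lower is better). The treatment names and simulated values are hypothetical and serve only to show the bookkeeping.

```python
def rank_probabilities(draws):
    """Build a ranking matrix from simulated relative risks.

    draws: dict mapping treatment name -> list of simulated RRs, one per
           simulation, all lists of equal length.
    Returns a dict mapping treatment -> list p, where p[k] is the proportion
    of simulations in which the treatment was ranked (k + 1)-th.
    """
    names = list(draws)
    n_sims = len(draws[names[0]])
    counts = {name: [0] * len(names) for name in names}
    for s in range(n_sims):
        # Smallest simulated RR ranks first; ties keep the input order.
        ordered = sorted(names, key=lambda name: draws[name][s])
        for rank, name in enumerate(ordered):
            counts[name][rank] += 1
    return {name: [c / n_sims for c in counts[name]] for name in names}

# Hypothetical draws for two active treatments and a reference arm.
example_draws = {
    "treatment_A": [0.2, 0.3, 0.5, 0.1],
    "treatment_B": [0.4, 0.2, 0.6, 0.3],
    "reference": [1.0, 1.0, 1.0, 1.0],
}
ranking = rank_probabilities(example_draws)
for name, probs in ranking.items():
    print(name, probs)
```

Each row of the resulting matrix sums to 1, so a treatment's probability of being ranked best and its probability of being ranked second are read from the same row.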

Study selection

Three electronic databases—PubMed®, Embase®, and the Cochrane Central Register of Controlled Trials (CENTRAL)—were searched for RCTs published prior to December 20, 2017. Studies were selected that met predefined eligibility criteria based on populations of interest (inclusion/exclusion criteria), interventions (drug dosage/frequency), and outcomes (fracture assessment). The population of interest included women with PMO who were eligible to receive pharmacotherapy for primary or secondary prevention of fractures. Nine comparators were selected: eight (alendronate, denosumab, ibandronate, raloxifene, risedronate, strontium ranelate, teriparatide, and zoledronic acid) on the basis of National Institute for Health and Care Excellence recommendations of available osteoporosis treatments plus the investigational treatment romosozumab. Search terms were specific to disease, type of study, drugs, combined free text, and Medical Subject Headings (MeSH). We restricted the search to English-language publications. Exclusion criteria for the literature review included non-RCTs, phase 1 trials, letters, editorials, case reports, comments, studies not involving humans, and trials reporting only bone mineral density (BMD), a surrogate for treatment response.

Feasibility assessment

Group data for the NMA were obtained for four types of fracture: VF, NVF, wrist, and hip. Studies that did not provide sufficient information to allow derivation of RR or that reported zero events were excluded. Determination of the RR of fracture was contingent on studies reporting the number of patients in each study arm (N) and the number (n) or percentage of patients with fractures in each study arm.
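The derivation of RR from arm-level counts can be sketched as follows. This is an illustrative helper under the assumption of a standard large-sample standard error for the log RR; the function names are hypothetical, and the zero-event check mirrors the exclusion rule described above.

```python
import math

def events_from_percentage(percent, n_patients):
    """Recover an event count when a study reports only a percentage."""
    return round(percent / 100 * n_patients)

def relative_risk(events_trt, n_trt, events_ctl, n_ctl):
    """Return (RR, standard error of log RR) from arm-level counts."""
    if min(events_trt, events_ctl) <= 0:
        # Arms with zero events do not yield a finite log RR.
        raise ValueError("zero-event arms cannot be used to derive an RR")
    rr = (events_trt / n_trt) / (events_ctl / n_ctl)
    se_log_rr = math.sqrt(1 / events_trt - 1 / n_trt + 1 / events_ctl - 1 / n_ctl)
    return rr, se_log_rr

# Hypothetical arms: 10/100 fractures on treatment, 20/100 on control.
rr, se = relative_risk(10, 100, 20, 100)
print(f"RR = {rr:.2f}, SE(log RR) = {se:.3f}")
```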

A secondary article review was conducted to identify studies that reported on the RR of fragility fractures and that would therefore qualify for the indirect treatment comparisons. With the exception of romosozumab studies, only trials examining licensed dosages of a single agent for postmenopausal osteoporosis were included, as defined by the European Medicines Agency. Studies that included both a licensed and a nonlicensed dosage were considered only if it was possible to separate fracture outcomes in study arms by dosage. Dose-ranging studies without a control arm and switching studies assessing only a sequence of treatments were excluded, as were studies comparing the same active drug but only assessing the addition of a supplement in 1 of the study arms. Major osteoporotic fractures and clinical fractures were not considered for the NMA, because their definitions varied widely across studies.

After the secondary article review, an assessment was made to see if it was feasible to connect networks between treatments. The patient population characteristics of the remaining studies were then reviewed for differences in age, ethnicity, BMD at baseline, previous fractures, fracture definition and assessment, outcome measures (efficacy or safety outcomes), and previous treatments to assure the exchangeability of patient data across trials. Studies that exhibited heterogeneity factors in these patient population criteria (Supplementary Fig. 2) that would prevent adequate exchangeability of patient data across trials were excluded from the base case analysis.

Meta-analyses

We ran pair-wise meta-analyses (performed in Stata 14.1) when data comparing the same two treatments for the same outcome were available. Results from these comparisons were used primarily to check for statistical heterogeneity and for inconsistencies in the results obtained from the indirect treatment comparisons. Results were pooled by means of fixed-effects models that used the Cochran-Mantel-Haenszel method [10] and random-effects models that used the DerSimonian and Laird model [11]. The estimate of heterogeneity was taken from the Cochran-Mantel-Haenszel method and was assessed by both Cochran’s Q test and I² statistics.
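For readers unfamiliar with these pooling methods, the following is a minimal sketch of a Mantel-Haenszel fixed-effects RR, Cochran's Q, I², and a DerSimonian and Laird random-effects RR. It is not the Stata routine used in the analysis. Computing Q around the Mantel-Haenszel estimate is our reading of "heterogeneity taken from the Mantel-Haenszel method" and should be treated as an assumption, as should the function name and the example counts.

```python
import math

def pool_pairwise(studies):
    """studies: list of (events_trt, n_trt, events_ctl, n_ctl) tuples.

    Assumes every arm has at least one event (zero-event studies excluded).
    """
    # Mantel-Haenszel fixed-effects pooled relative risk.
    num = sum(a * n2 / (n1 + n2) for a, n1, c, n2 in studies)
    den = sum(c * n1 / (n1 + n2) for a, n1, c, n2 in studies)
    rr_mh = num / den

    # Study-level log RRs and inverse-variance weights.
    y = [math.log((a / n1) / (c / n2)) for a, n1, c, n2 in studies]
    w = [1 / (1 / a - 1 / n1 + 1 / c - 1 / n2) for a, n1, c, n2 in studies]

    # Cochran's Q, here measured around the Mantel-Haenszel estimate (assumption).
    q = sum(wi * (yi - math.log(rr_mh)) ** 2 for wi, yi in zip(w, y))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0

    # DerSimonian and Laird between-study variance and random-effects pooled RR.
    scale = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - df) / scale) if scale > 0 else 0.0
    w_re = [1 / (1 / wi + tau2) for wi in w]
    rr_dl = math.exp(sum(wi * yi for wi, yi in zip(w_re, y)) / sum(w_re))

    return {"rr_mh": rr_mh, "q": q, "i2": i2, "tau2": tau2, "rr_dl": rr_dl}

# Hypothetical counts for two trials of the same comparison.
result = pool_pairwise([(10, 100, 20, 100), (30, 100, 20, 100)])
print(result)
```

When the between-study variance is estimated as zero, the random-effects weights reduce to the inverse-variance weights and the two models give closely similar pooled estimates.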

Differences in trial duration were taken into account, because the time periods during which individuals experienced at least one fragility fracture varied significantly across studies. An underlying Poisson process was, therefore, assumed for each trial arm, so the time until a fracture occurred followed an exponential distribution. Network analyses were implemented using both fixed-effects and random-effects approaches. The models were fitted to the data using Bayesian Markov Chain Monte Carlo methods (Gibbs sampling) and were implemented in WinBUGS 1.4.3 (University of Cambridge MRC Biostatistics Unit, Cambridge, England, UK).
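The consequence of the Poisson assumption can be shown with a short sketch: under a constant fracture rate, the proportion of patients with at least one fracture by time t is 1 − exp(−rate × t), so trials of different length can be compared on the rate scale. In Bayesian NMAs this is commonly encoded with a complementary log-log link, although we do not state here which link was used; the function names and numbers below are hypothetical.

```python
import math

def hazard_rate(events, n_patients, years):
    """Constant rate implied by the proportion with at least one event."""
    proportion = events / n_patients
    return -math.log(1.0 - proportion) / years

def cumulative_risk(rate, years):
    """Probability of at least one event by a given time under a constant rate."""
    return 1.0 - math.exp(-rate * years)

# Two hypothetical arms with the same underlying rate but different follow-up.
rate = 0.05  # fractures per patient-year (hypothetical)
risk_1y = cumulative_risk(rate, 1.0)
risk_5y = cumulative_risk(rate, 5.0)
print(f"1-year risk {risk_1y:.3f}, 5-year risk {risk_5y:.3f}")

# Converting the observed proportions back recovers the common rate.
print(hazard_rate(risk_1y * 1000, 1000, 1.0), hazard_rate(risk_5y * 1000, 1000, 5.0))
```

The two arms show very different cumulative risks purely because of follow-up length, which is why unadjusted risks from trials of 12 and 60 months cannot be pooled directly.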

Model selection

As the validity of the results relied on the model converging in a satisfactory manner, we made a visual assessment at the end of each simulation using history trace plots, Brook-Gelman-Rubin plots, smoothed kernel posterior density plots, and autocorrelation plots. Both fixed-effects and random-effects approaches were run for each model, with the most appropriate model being selected on the basis of the total residual deviance and deviance information criterion. Using this information, we concluded that the fixed-effects model was a better fit for the data. The Cochran Q test and I² statistics generated for each pair-wise comparison in each network supported this conclusion in most cases.
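The comparison of model fit can be sketched as below. The deviance information criterion (DIC) is the posterior mean residual deviance plus the effective number of parameters, and lower values indicate a better trade-off between fit and complexity. The decision rule (retain the simpler fixed-effects model unless the random-effects DIC is lower by a chosen margin), the margin of 3 points, and all numbers are assumptions for illustration, not values from this analysis.

```python
def dic(mean_residual_deviance, effective_parameters):
    """Deviance information criterion: Dbar + pD."""
    return mean_residual_deviance + effective_parameters

def prefer_model(fixed, random, min_difference=3.0):
    """Choose between models given (Dbar, pD) pairs for each.

    The simpler fixed-effects model is retained unless the random-effects
    model lowers the DIC by at least min_difference (an assumed margin).
    """
    dic_fixed = dic(*fixed)
    dic_random = dic(*random)
    if dic_fixed - dic_random >= min_difference:
        return "random-effects"
    return "fixed-effects"

# Hypothetical (Dbar, pD) pairs for one network.
choice = prefer_model(fixed=(40.0, 12.0), random=(39.5, 14.0))
print(choice)
```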

Sensitivity analyses

Each of the networks was subjected to two sensitivity analyses. One analysis assessed the effect of removing strontium ranelate trials from the networks, because the drug’s manufacturer ceased its distribution in 2017 [12]. A second sensitivity analysis assessed the impact of studies with low or ambiguous quality of evidence (e.g., studies reporting fracture outcomes as adverse events) on the NMA findings by excluding the Alendronate Phase III Osteoporosis Treatment Study Group (APOTSG) [13], EVista Alendronate Comparison (EVA) [14], and Bone Mineral Density–MultiNational (BMD-MN) trials [15].

Results

Study characteristics

After removing studies duplicated in the three databases (n = 3054), we screened 4978 articles on the basis of title and abstract (Supplementary Fig. 1). An additional 4252 articles were excluded for failing to meet inclusion criteria, leaving 726 articles for full-text screening. Twenty-nine of these could not be found. Of the remaining 697 publications that underwent full-text review, 76 articles were included for data extraction.

Following secondary article review, 56 distinct studies emerged for use in the indirect treatment comparisons. Of these, 25 studies reported on VF, 25 on NVF, 18 on hip fracture, 11 on wrist fracture, 17 on clinical fractures, and 4 on major osteoporotic fractures and were deemed suitable for the networks (Supplementary Fig. 2). Thirty-four studies qualified for data extraction; however, 12 were excluded because their patient population characteristics were considered to be too different from those of patients in ACTIVE or because they focused on clinical or major osteoporotic fractures. Thus, a total of 22 studies remained for inclusion in the 3 fracture networks (VF, NVF, and wrist; Supplementary Table 1). Baseline characteristics of studies included in the networks and reasons for excluding studies from the networks due to heterogeneity factors are shown in Supplementary Tables 2 and 3.

The time periods during which individuals experienced at least one fragility fracture ranged from 12 to 60 months post-baseline across studies in the NMA.

Vertebral fractures network

Of the 25 RCTs providing suitable VF data, 7 studies were excluded because their patient population characteristics were considered to be too different from those of patients in ACTIVE (Supplementary Table 1). The final analysis therefore comprised 18 VF studies comparing 11 treatments in 40,901 women with PMO (Supplementary Fig. 3).

Figure 1 presents relative risk data with placebo as the reference treatment. All treatments exhibited superior efficacy to placebo, and all treatment effects were statistically significant versus placebo for preventing VF (p < 0.05). Abaloparatide had the greatest effect versus placebo (RR = 0.13; 95% credible interval [CrI] 0.04–0.34), followed by teriparatide (RR = 0.27; 95% CrI 0.20–0.37) and zoledronic acid (RR = 0.29; 95% CrI 0.23–0.36). Using abaloparatide as reference treatment, abaloparatide was significantly more effective compared with strontium ranelate and all oral bisphosphonates; however, no significant differences emerged versus teriparatide, denosumab, raloxifene, zoledronic acid, or romosozumab (Supplementary Fig. 4). Abaloparatide was ranked first among all treatments, with a 79% estimated probability of being the most effective agent for preventing VF (Fig. 2a). The second highest estimated probability, 29%, was accorded to teriparatide.

Fig. 1

Relative risk of treatments versus placebo in the vertebral fractures network. Treatment effects were significantly different for all treatments versus placebo

Fig. 2

Treatment ranking of osteoporosis treatments in the network meta-analysis. 11 osteoporosis treatments in the vertebral fracture network (fixed-effects model) (a). 11 osteoporosis treatments in the nonvertebral fracture network (fixed-effects model) (b). 8 osteoporosis treatments in the wrist fracture network (fixed-effects model) (c). Circles denote highest probabilities for each treatment

Nonvertebral fractures network

Four of the 25 RCTs providing suitable NVF data were excluded because their patient population characteristics were considered to be too different from those of patients in ACTIVE (Supplementary Table 1). Thus, 21 studies remained, comparing 11 treatments in 62,606 women with PMO for inclusion in the final NVF analysis (Supplementary Table 1, Supplementary Fig. 5).

Figure 3 shows relative risk data with placebo as the reference treatment. All treatments except ibandronate had a beneficial treatment effect in preventing NVF relative to placebo, although the effect for raloxifene was not statistically significant at the p < 0.05 level. Abaloparatide had the greatest treatment effect (RR = 0.50; 95% CrI 0.28–0.85), followed by teriparatide (RR = 0.62; 95% CrI 0.47–0.82) and romosozumab (RR = 0.64; 95% CrI 0.49–0.81). Abaloparatide was also significantly more effective compared with ibandronate and strontium ranelate (Supplementary Fig. 6). Abaloparatide was ranked first among osteoporosis treatments, with a 70% estimated probability of being the most effective agent for preventing NVF (Fig. 2b); teriparatide was ranked second with a 44% probability.

Fig. 3

Relative risk of treatments versus placebo in the nonvertebral fracture network. ᵃAbaloparatide effect significantly different from network treatment

Wrist fractures

Of the 11 RCTs providing suitable wrist data, 1 was excluded because its patient population characteristics were considered to be too different from those of patients in ACTIVE (Supplementary Table 1). Ten RCTs comparing 8 treatments in 24,523 women with PMO were included in the wrist fractures network (Supplementary Table 1, Supplementary Fig. 7). Figure 4 presents relative risk data with placebo as the reference treatment. Beneficial effects in preventing wrist fracture relative to placebo were statistically significant for abaloparatide and alendronate only (p < 0.05). Abaloparatide had the greatest treatment effect (RR = 0.39; 95% CrI 0.15–0.90), followed by alendronate (RR = 0.46; 95% CrI 0.29–0.70) and raloxifene (RR = 0.63; 95% CrI 0.20–2.09). Abaloparatide was also significantly more effective at preventing wrist fracture than strontium ranelate (Supplementary Fig. 8). Abaloparatide was ranked as having a 53% estimated probability of being the most effective treatment to prevent wrist fracture (Fig. 2c); alendronate was ranked second with a 47% probability.

Fig. 4

Relative risk of treatments versus placebo in the wrist fracture network. ᵃAbaloparatide effect significantly different from network treatment

Sensitivity analyses

Excluding strontium ranelate from the main analysis occasioned very minor changes in the results (Supplementary Tables 4, 5, and 6). Similarly, exclusion of studies providing low or unclear quality of evidence had a minimal impact on NMA findings.

Hip fractures

In ACTIVE, there were two hip fractures in the placebo group and zero in the abaloparatide group [7]. As ACTIVE was the only abaloparatide study included in the NMA, the absence of events in the abaloparatide group caused convergence issues in the hip network. Attempts to compensate for the lack of hip event data using methods from pairwise frequentist meta-analyses resulted in estimated treatment effects that lacked sufficient precision for inclusion in this study.

Discussion

Meta-analyses, although a critical tool for analyzing results of multiple independent studies, can only make pair-wise comparisons of treatments. NMAs, by contrast, synthesize information over a network of comparisons to assess the relative effects of multiple interventions used for the same condition. NMA employs both direct and indirect evidence in a general statistical framework to generate estimates that integrate all available data [16]. In bypassing the limitations of traditional pair-wise meta-analyses, NMAs enable researchers to rank the relative efficacy of all interventions, including those that have not been compared directly in head-to-head trials. NMAs therefore provide crucial information to clinicians.

Our NMA was undertaken to provide evidence regarding the relative efficacy of 10 treatments for postmenopausal women with osteoporosis who are at high risk of fragility fractures. The final analysis comprised 22 RCTs yielding usable data for 3 types of fractures: 18 RCTs with data on VF, 21 with data on NVF, and 10 with data on wrist fracture. The individual networks revealed that, among the available treatment options, abaloparatide had the greatest treatment effect relative to placebo in postmenopausal women with and without prior fractures for all fracture types under consideration. Furthermore, treatment ranking indicated that abaloparatide had the highest estimated probability of being the most effective treatment for preventing fractures in each network: 79% for VF, 70% for NVF, and 53% for wrist fracture. Of note, each network demonstrated a good level of agreement with the direct trial evidence and direct pair-wise comparisons.

The validity of NMAs depends on the comparability of patients across trials. This means that all included studies are measuring the same relative treatment effects, and any observed differences are due to chance. Put differently, all treatments could have been included in the same study and, therefore, could be viewed as truly competing interventions [17]. By this standard, we believe that our NMA is valid. The strength of our NMA rests on several factors, key among them being the use of a systematic and comprehensive approach to capturing data. Systematic reviews of RCTs are germane to the development of evidence-based medicine and yield high-quality information when performed in a rigorous manner.

The robustness of our RCT data was confirmed by the Cochrane Risk of Bias Tool and dual reviews of bias assessments. Additionally, the RCTs underwent both a feasibility assessment and a secondary review to validate their fitness for indirect treatment comparisons. The heterogeneity analysis indicated low or no heterogeneity between studies except when studies on strontium ranelate were included in comparisons of the vertebral and nonvertebral fractures networks. In these cases, the high or moderate heterogeneity may have been due to clinical or methodologic differences between the TReatment Of Peripheral OSteoporosis (TROPOS) [18] and Spinal Osteoporosis Therapeutic Intervention (SOTI) trials [19], as patients in the TROPOS trial were older than patients in the SOTI trial (mean age 76.7 years and 69.4 years, respectively) and had a different history of VF at baseline. To further ensure the quality of the data, we ran two sensitivity analyses: The first one assessed the effect of removing the strontium ranelate studies because of the withdrawal of the agent; the other assessed the impact of studies with low or ambiguous quality of evidence. Both analyses demonstrated a minimal impact on NMA findings. We selected Bayesian Markov Chain Monte Carlo methods to fit the models to the data, because a Bayesian NMA provides a flexible framework by which to allow for complexity in the data [16].

To our knowledge, there have been five NMAs of osteoporosis treatments previously published in peer-reviewed journals, none of which included abaloparatide [20,21,22,23,24]. Recently, an NMA including abaloparatide was prepared for the California Technology Assessment Forum (CTAF), a core program of the Institute for Clinical and Economic Review (ICER) that publicly evaluates objective evidence reports and recommends how evidence can be used to improve the quality and value of health care [25]. This NMA was published in an ICER evidence report and included two anabolic agents, abaloparatide and teriparatide, as well as zoledronic acid [25]. The RCTs used in the networks enrolled postmenopausal women at high risk for a fragility fracture, and all were placebo-controlled [25]. Findings largely corroborate those from our research. In the VF network, all active treatments performed significantly better than placebo, with the RR of VF being 0.13 (CrI 0.03–0.33) for abaloparatide, 0.17 (CrI 0.09–0.29) for teriparatide, and 0.30 (CrI 0.24–0.37) for zoledronic acid [25]. In the NVF network, each of the active therapies significantly reduced fracture risk compared with placebo, with the RR of NVF being 0.51 (CrI 0.28–0.85) for abaloparatide, 0.61 (CrI 0.41–0.88) for teriparatide, and 0.75 (CrI 0.64–0.87) for zoledronic acid [25]. Wrist fracture data were not reported, and the benefits of treatments for hip fracture were judged to be uncertain. The CTAF report concluded that, when active agents were compared with placebo, there was (1) moderate certainty that anabolic agents provided a small or substantial net health benefit and (2) high certainty that they provided at least a small net health benefit. These conclusions were based on the NMA showing a substantial reduction in VF and a small-to-moderate reduction in NVF [25].

Our systematic review and NMA have several limitations. First, we confined our data searches to English-only publications, which meant that 35 non-English publications were excluded; however, that particular restriction has not been shown previously to bias systematic reviews and meta-analyses [26]. Though several studies were excluded based on heterogeneity factors as described above, differences in the designs and population characteristics of the studies included in the NMA represent a second limitation. Additionally, comparators evaluated in our NMA were restricted to those included in National Institute for Health and Care Excellence recommendations of available osteoporosis treatments plus the investigational treatment romosozumab. Consequently, agents such as bazedoxifene, with regulatory approval in a small number of countries, are not taken into account. A further possible limitation of our NMA is that we did not consider adverse events or drug costs. It is well-known that adverse events affect adherence to bisphosphonate therapy, with one large observational study reporting only 45% compliance 1 year after initiation and only 30% compliance at 2 years [27]. Adherence to anabolic agents appears to be better, despite the requirement for daily subcutaneous injection. A small (N = 111) retrospective chart review showed the persistence rate with teriparatide to be 90% at 6 months and 75% at 18 months [28]. Only 20% of patients in that study cited adverse events as the reason for nonadherence.

Healthcare economics are especially relevant in the evaluation of hip fractures, which are associated with significant short-term morbidity, long-term loss of independence, nursing home placement, and increased mortality [29]. Hip fractures skew the cost distribution of fragility fractures by accounting for 72% of total outlays but only 14% of all fractures [3]. The absence of hip fracture data is a further limitation of our NMA; the lack of events in the abaloparatide arm of ACTIVE [7] precluded a meaningful assessment of treatment effects on hip fractures using NMA. However, evidence from the abaloparatide extension trial points to a possible protective effect of abaloparatide treatment. ACTIVExtend enrolled 558 women from the abaloparatide group and 581 from the placebo group, all of whom were switched to alendronate, 70 mg weekly for 2 years [8]. The cumulative incidence of hip fracture after 24 months was 5 in the group switched from placebo and zero in the group switched from abaloparatide, implying long-term risk reduction with active treatment. Additionally, a recent observational study of patients taking teriparatide (N = 14,284) reported a statistically significant decrease in hip fracture among patients who persisted longer with treatment or had higher adherence [30]. Regardless of whether these findings represent a class effect of anabolic agents, the two reports taken together are, at the least, encouraging.

Conclusions

Given the high level of morbidity and healthcare costs associated with fragility fractures, it is imperative to identify therapeutic agents and other treatment modalities that can reduce fracture risk. In this NMA of 10 pharmacotherapeutic agents used to treat osteoporosis, abaloparatide produced the greatest reduction in the relative risk of VF, NVF, and wrist fracture versus placebo in postmenopausal women with or without prior fracture. Generalizability of the findings is limited to the trial populations included in our NMA. Additionally, clinicians need to be somewhat skeptical of the conclusions of analyses that involve only indirect comparisons rather than head-to-head comparisons.

It should be noted that although anabolic agents may have a significant role in preventing fragility fractures, regulatory authorities limit the use of anabolic drugs for postmenopausal osteoporosis to 18 to 24 months [31, 32]. Furthermore, at least one study has shown that BMD gains from anabolic agents are quickly lost in the absence of follow-up therapy [33]. However, emerging evidence suggests that optimal sequencing of therapies is the key to preserving gains made on anabolic agents [34,35,36,37]. Consistent with this hypothesis, ACTIVExtend showed cumulative benefit from treatment with abaloparatide followed by an antiresorptive agent [8]. Additional fracture endpoint studies are awaited to help guide selection of the most appropriate agents and ideal duration of follow-up treatment.