Introduction
Atopic dermatitis (AD), a common chronic inflammatory skin disorder characterized by eczematous, lichenified lesions and intense pruritus, usually appears in childhood and is often associated with comorbidities such as asthma and allergic rhinitis [1–3]. AD affects 15–20% of children (< 18 years of age) and 1–3% of adults [4]. Most patients with AD suffer from mild-to-moderate disease [5–7].
The goal of AD management is to prevent and treat disease flares. US treatment guidelines recommend topical corticosteroids (TCSs) and/or topical calcineurin inhibitors (TCIs) as well as phototherapy for mild-to-moderate AD and immunosuppressants or biologics for moderate-to-severe/refractory disease [8]. Although there are safety concerns with the prolonged use of high-potency TCSs [9], a more significant problem is nonadherence to therapy because of fear of skin atrophy, which in turn leads to poor disease control [10]. While TCIs reduce AD severity, special warnings highlight a possible risk for lymphoma and skin cancer, and application site reactions may limit their use [3, 8, 11].
Crisaborole is a nonsteroidal topical phosphodiesterase 4 inhibitor (PDE4i) that acts by regulating inflammatory cytokine production, which is overactive in patients with AD [12, 13]. Crisaborole was initially approved by the US Food and Drug Administration (FDA) in December 2016 for use as a topical treatment of mild-to-moderate atopic dermatitis in patients ≥ 2 years of age. In March 2020, the FDA approved a supplemental New Drug Application that expanded the use of crisaborole to include children ≥ 3 months of age. Crisaborole was approved in the European Union in March 2020 for the treatment of mild-to-moderate atopic dermatitis in adults and pediatric patients from 2 years of age with ≤ 40% body surface area affected.
Crisaborole was previously approved in Australia, Canada, and Israel. Crisaborole applied twice daily was shown to be effective in patients ≥ 2 years of age with mild-to-moderate AD and was associated with a low incidence of treatment-related/treatment-emergent adverse events (AEs) [14]. A recent systematic review and network meta-analysis for PDE4is versus vehicle has shown that topical PDE4is are more effective than vehicle alone for patients with mild-to-moderate AD [15]. Nevertheless, there is a need to compare crisaborole with other topical treatments and to synthesize available evidence from newly published randomized clinical trials (RCTs).
A systematic literature review and a network meta-analysis were performed to evaluate the comparative efficacy and safety of crisaborole versus other topical pharmacologic therapies for mild-to-moderate AD among patients aged ≥ 2 years.
Methods
Systematic Literature Review
Searches were conducted in MEDLINE (Ovid), Embase (Ovid), the Cochrane Central Register of Controlled Trials (CENTRAL; Ovid), and the Database of Abstracts of Reviews of Effects (DARE; Ovid) to identify English language articles published between inception and 10 March 2020 reporting RCTs that evaluated possible treatments for patients with mild-to-moderate AD. This systematic literature review adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) guidelines [16, 17]. The search strategies combined controlled vocabulary terms with free-text search terms for the disease and study designs of interest (Supplement Tables S1–S3). In addition, we hand-searched abstracts from the 2015–2018 scientific meetings of the American Academy of Dermatology and the European Academy of Dermatology and Venereology, as well as bibliographies of included publications and systematic reviews identified in the search.
Identification and Selection of Studies
The review was conducted using a prespecified protocol. Predefined eligibility criteria were based on the Population, Interventions, Comparisons, Outcomes, and Study design tool (PICOS; Table S4). Two blinded, independent reviewers examined the citations; any discrepancies were resolved by a third reviewer. The primary outcome of interest was Investigator’s Static Global Assessment (ISGA) of 0/1 (clear/almost clear) at 28–42 days. Secondary outcomes of interest were AEs.
The relevant information extracted from eligible studies included study design and methods, patient characteristics, intervention details (e.g., dosing, schedule, components of vehicle), and efficacy and safety outcomes, along with time points for outcome assessments. A single reviewer extracted data, and a second reviewer quality-assessed the data accuracy.
Quality Assessments
A risk-of-bias assessment was undertaken using the Cochrane tool, in accordance with the National Institute for Health and Care Excellence (NICE) single technology appraisal guidelines for evidence submissions [18, 19].
Feasibility Assessments
Prior to analysis, a feasibility assessment determined the availability of evidence and identified potential sources of heterogeneity. All studies were compared with respect to study- and patient-level characteristics, outcome definitions, and time points of evaluation.
A network meta-analysis was performed to obtain relative treatment effects for achievement of ISGA 0/1 at 28–42 days. All analyses were conducted within a Bayesian framework [20], with a 100,000-iteration run-in (burn-in) phase followed by 100,000 iterations for parameter estimation. All calculations were performed using OpenBUGS 3.2.3 [21]. Models with fixed effects and with random effects on treatment effects were explored. Baseline risk regression was used to adjust for differences in vehicle response across RCTs, which were driven by variation in vehicle composition and by heterogeneity in patient characteristics. Baseline risk adjustment thereby indirectly adjusted for heterogeneity in effect modifiers across RCTs [22]. Class-effects models with baseline risk regression used fixed effects across RCTs but random effects for treatments within class; the classes were crisaborole, vehicle, and non-crisaborole treatments. Model fit was assessed by comparing the deviance information criterion (DIC) and the posterior mean of the residual deviance for fixed- and random-effects models [23]. The model with the lowest DIC was considered the best fitting. Hazard ratios (HRs) reflect the “hazard” of response, which is a favorable event; thus, an HR > 1.0 for a comparison between two treatments implies better performance for the first treatment. A detailed description of the statistical methods can be found in the Supplement.
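For orientation, one common way to write a binomial model with a complementary log-log link and a baseline risk regression term is sketched below. This follows the general form used in NICE Decision Support Unit guidance and is an assumption about notation, not a reproduction of the model in the Supplement. All symbols are introduced here for illustration: r_ik and n_ik are the number of responders and patients in arm k of trial i, t_i is the follow-up time, b_i is the baseline (typically vehicle) arm, mu_i is the log hazard of response in that arm, mu-bar is its mean across trials, d_k is the log hazard ratio of treatment k versus vehicle at mean baseline risk (with d for vehicle set to 0), and beta_k is the regression slope (assumed to be 0 for vehicle and a common value for active treatments).

```latex
\begin{aligned}
r_{ik} &\sim \mathrm{Binomial}(p_{ik},\, n_{ik}), \\
\operatorname{cloglog}(p_{ik}) &= \log(t_i) + \mu_i
  + \bigl[(d_k - d_{b_i}) + (\beta_k - \beta_{b_i})(\mu_i - \bar{\mu})\bigr]\,
    \mathbb{1}(k \neq b_i), \\
\mathrm{HR}_{k\ \mathrm{vs}\ c} &= \exp(d_k - d_c).
\end{aligned}
```

Under this parameterization, an HR above 1.0 for treatment k versus treatment c means a higher hazard of achieving ISGA 0/1 with k, which matches the interpretation given in the text.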
Compliance with Ethics Guidelines
This article is based on previously conducted studies and does not contain any studies with human participants or animals performed by any of the authors. The review adhered to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines.
Discussion
This systematic literature review and network meta-analysis were undertaken to evaluate the comparative effectiveness and safety of crisaborole versus other topical pharmacologic therapies for the treatment of mild-to-moderate AD. In the systematic literature review, no studies were identified that compared crisaborole to other active treatments. Consequently, a network meta-analysis indirectly compared treatments for which no head-to-head trials were available and synthesized available evidence across treatments. No studies of TCSs were identified that reported data on ISGA 0/1; therefore, they were not included in the network meta-analysis.
With respect to efficacy, slightly different versions of the ISGA scale were used among the RCTs. The crisaborole trials used a five-point ISGA scale as an endpoint, whereas other trials used a six-point ISGA scale. Despite this, disease severity measured by baseline ISGA seemed to be comparable across the RCTs, with most patients having baseline ISGAs of 2–3 (mild-to-moderate). For analysis purposes, we assumed that the “clear” (ISGA = 0) and “almost clear” (ISGA = 1) categories are similar for both scales because treatment response is defined similarly across both scales. The vehicle arms of the crisaborole trials showed a high ISGA 0/1 response, greater than that seen in the vehicle arms of most RCTs that evaluated other topical therapies. This suggests that the vehicle preparations in some of the RCTs may confer less therapeutic benefit than those administered in the crisaborole RCTs. Heterogeneity in patient characteristics [22], differences in the season when trials were conducted [31], and differences in potency between creams and ointments [32] may have modified the observed treatment effects. Properties of vehicle formulations may affect drug delivery and efficacy, as well as drug tolerance profiles [33]. Some vehicle excipients have a more pronounced therapeutic effect on the skin and can improve clinical appearance and skin barrier function directly [33].
There was strong evidence that patients treated with crisaborole or tacrolimus, 0.1% or 0.03%, were more likely to achieve ISGA 0/1 at 28–42 days than those receiving vehicle. Furthermore, there was evidence that patients treated with crisaborole were more likely to achieve ISGA 0/1 at 28–42 days than those treated with pimecrolimus 1%. Although there was weak evidence of a difference between crisaborole 2% and tacrolimus 0.03%, and no evidence of a difference with tacrolimus 0.1% in model 1, all point estimates favored crisaborole.
Our findings are roughly consistent with other reported network meta-analyses on crisaborole in patients with mild-to-moderate AD; however, comparability may be limited because other studies did not adjust for baseline risk (variation in efficacy rates for vehicle) [34]. The Institute for Clinical and Economic Review (ICER) report suggested that pimecrolimus was trending as superior to crisaborole [34]. However, the results of their analyses had wide credible intervals and showed little or no evidence of a difference in efficacy between treatments. Although the authors of the ICER report noted a substantial difference in baseline risk across RCTs regarding treatment response for vehicles, they did not adjust for this in their analyses. The NICE Decision Support Unit recommends regression on baseline response as a means of adjusting for heterogeneity where appropriate [35], and, in the present case, the credible interval for the interaction term excluded zero, with a slope of −0.89 and a 95% credible interval of −1.26 to −0.47. In the Drug Effectiveness Review Project review, significantly more patients had treatment response with crisaborole than with vehicle [36]. The authors of that report also did not perform any adjusted analyses. As stated previously, a recent systematic literature review and network meta-analysis for PDE4is that included crisaborole and other PDE4is versus vehicle showed that topical PDE4is, particularly crisaborole, were more effective than vehicle alone [15].
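To make the role of the interaction term concrete, the sketch below shows how a negative slope on baseline risk changes a trial-specific hazard ratio versus vehicle. Only the slope of −0.89 comes from the text. The function name, the hazard ratio of 2.0 at mean baseline risk, and the baseline log-hazard values are hypothetical and chosen purely for illustration; the sketch also assumes the baseline is expressed on the log-hazard scale.

```python
import math

SLOPE = -0.89  # posterior slope of the interaction term, as reported in the text


def adjusted_hr(hr_at_mean_baseline: float,
                baseline_log_hazard: float,
                mean_baseline_log_hazard: float,
                slope: float = SLOPE) -> float:
    """Hazard ratio versus vehicle expected in a trial with the given baseline.

    Assumes a linear baseline risk regression on the log-hazard scale:
    log HR_i = log HR(at mean baseline) + slope * (mu_i - mean(mu)).
    """
    log_hr = math.log(hr_at_mean_baseline) + slope * (
        baseline_log_hazard - mean_baseline_log_hazard
    )
    return math.exp(log_hr)


# Hypothetical inputs, not taken from the study.
HR_AT_MEAN = 2.0
MEAN_BASELINE = -5.0

# Trial with a lower-than-average vehicle response.
hr_low_vehicle = adjusted_hr(HR_AT_MEAN, -5.5, MEAN_BASELINE)
# Trial with a higher-than-average vehicle response.
hr_high_vehicle = adjusted_hr(HR_AT_MEAN, -4.5, MEAN_BASELINE)

print(f"HR vs vehicle, low vehicle response:  {hr_low_vehicle:.2f}")
print(f"HR vs vehicle, high vehicle response: {hr_high_vehicle:.2f}")
```

With a negative slope, the same treatment appears less effective relative to vehicle in trials where the vehicle arm responds well, which is why unadjusted comparisons across trials with different vehicle response rates can mislead.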
Safety outcomes were not analyzed by means of a network meta-analysis in the present study because this was deemed inappropriate for a variety of reasons (e.g., difference in outcome definitions, sparsity of data). Therefore, the results for safety were only described qualitatively, and no definite conclusions regarding relative safety of crisaborole versus TCIs could be drawn. Caution should be taken in the interpretation of naive comparisons because no formal comparative (indirect) assessments were made.
The strengths of our study include the innovative application of meta-analysis methodologies to address the need for comparative efficacy evidence. The systematic literature review was performed in accordance with published guidelines, and the network meta-analysis was based on well-established Bayesian methodology [20, 37, 38]. Our systematic review and network meta-analysis were rigorous, used sophisticated statistical models, and reached conclusions that have not been previously documented. Heterogeneity was addressed, where possible, to fulfill the homogeneity assumption necessary for the network meta-analysis. A comprehensive feasibility assessment was conducted a priori, including an evaluation of the clinical heterogeneity between trials, which showed that studies were similar for many of the characteristics of interest. Baseline risk regression was performed to adjust for differences in vehicle response and heterogeneity in treatment effects across trials.
There are several limitations to this study. First, the interval for the primary time point of interest was wide at 28–42 days. Because the efficacy of interventions may change with prolonged use, this is a potential source of heterogeneity and may have affected the results for this outcome (i.e., ISGA 0/1 at 28–42 days). To control for this variability in follow-up time, a complementary log-log (cloglog) model was applied for ISGA 0/1 at the 28- to 42-day time point.
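The following sketch illustrates why a complementary log-log link can absorb differences in follow-up time. It assumes a constant response rate over time (exponential time to response), which is the usual assumption behind this link and may not be exactly what the Supplement specifies. The function names and the daily rate of 0.01 are hypothetical.

```python
import math


def cloglog(p: float) -> float:
    """Complementary log-log transform: log(-log(1 - p))."""
    if not 0.0 < p < 1.0:
        raise ValueError("p must lie strictly between 0 and 1")
    return math.log(-math.log(1.0 - p))


def response_probability(rate_per_day: float, days: float) -> float:
    """Probability of response by `days` under a constant daily response rate."""
    return 1.0 - math.exp(-rate_per_day * days)


def log_hazard_ratio(p_a: float, days_a: float, p_b: float, days_b: float) -> float:
    """Log HR of arm A versus arm B.

    Under a constant rate, cloglog(p) - log(t) equals the log response rate,
    so the follow-up time enters the model only as an offset.
    """
    return (cloglog(p_a) - math.log(days_a)) - (cloglog(p_b) - math.log(days_b))


# Hypothetical: the same underlying rate observed at two different time points.
rate = 0.01
p28 = response_probability(rate, 28)
p42 = response_probability(rate, 42)

print(f"Response proportion at 28 days: {p28:.3f}")
print(f"Response proportion at 42 days: {p42:.3f}")
print(f"Log HR (42-day arm vs 28-day arm): {log_hazard_ratio(p42, 42, p28, 28):.6f}")
```

The raw response proportions differ between the two assessment times, yet the log hazard ratio is zero (up to floating-point error), so trials assessed at 28 and 42 days can be placed on a common scale. If the response rate is not constant over time, this adjustment is only approximate.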
A second limitation concerns the efficacy data: possible confounding factors could not be fully addressed, and data for some other efficacy outcomes that are important in AD were not available. The efficacy difference may not be generalizable to some real-world clinical settings, where other factors may influence the use and benefits of active treatment (e.g., access issues). It was also not possible to fully explore all potential confounders by means of subgroup analyses. Further adjustment for differences in baseline characteristics could not be explored using meta-regression techniques because of the limited number of studies available for comparators. In addition, some important efficacy outcomes could not be evaluated because of data limitations (e.g., pruritus reduction, quality of life benefit).
A third limitation was that safety outcomes could only be described qualitatively. Network meta-analysis for safety was inappropriate because data were sparse and because studies differed in the outcome definitions used, in the reporting of data for comparators, in which outcomes were reported, and in study period.
Finally, there are no head-to-head trials comparing crisaborole with other active treatments, so treatments could only be compared indirectly using network meta-analysis. Results should be interpreted with caution and cannot replace a direct head-to-head evaluation.
Acknowledgements
We thank Dr. Jacob Thyssen for his expert clinical interpretations of study results and critical review of the manuscript. We also thank Alison Chapman and Dr. William Romero for their expert review of the manuscript with respect to both technical methods and clinical aspects. We also thank Caroline Cole, Janet Dooley, and Lauren Randall of Evidera for their very valuable contributions in providing medical writing, editing, and formatting support for this manuscript.