Background
Methotrexate (MTX) and biologic disease-modifying antirheumatic drugs (bDMARDs) have brought therapeutic success to most but not all patients with rheumatoid arthritis (RA). Optimization of treatment for individual patients and development of novel therapies are eagerly anticipated. One step toward the former is to establish a standard methodology to determine which bDMARD to prescribe. As the molecular target of each bDMARD is distinct, each effective treatment should be linked to changes in one or several particular biological processes that are ultimately manifested in a disease state. Despite this understanding and supporting evidence, the current basis for prescribing bDMARDs is unsatisfactory. According to the European League Against Rheumatism (EULAR) recommendations, tumor necrosis factor (TNF) inhibitors, tocilizumab (TCZ), and abatacept (ABT) are placed on an equal footing as first biologics for patients with RA who have shown an inadequate response to conventional synthetic DMARDs (csDMARDs) [1]. A methodology for determining which bDMARD is likely to be effective is therefore essential.
Most prediction studies of therapeutic response to bDMARDs using gene expression profiles of blood samples have focused on a single biologic [2–9], and to date no report of multiple drugs studied in parallel is available. Furthermore, study designs have varied, which makes it very challenging to combine their findings in translational studies.
Difficulty in reproducing gene expression studies has also plagued this area of study, partly due to nonuniformity of study design but also to data processing itself [10]. Analyses rarely incorporate existing biological knowledge or extend beyond the individual gene level to explain how biomarker findings are associated with the modes of action of targeted therapies for RA. Furthermore, to establish a robust model using gene expression, it is essential to interpret the results as effects of a collective network of related genes rather than of individual genes per se. In this context, Gene Set Enrichment Analysis (GSEA) [11], which has been shown to detect differentially expressed functional gene sets, should be a promising approach.
In this study, we took the aforementioned problems into consideration to identify therapeutic efficacy markers of three bDMARDs (infliximab [IFX], TCZ, and ABT) that target different molecules. We designed a unified test platform in which the subject recruitment criteria, treatment response evaluation, and assay system platform are well defined. GSEA was employed to identify and annotate the gene signatures associated with each biologic. The prediction performance, biological interpretation, and utility of each gene signature are presented.
Methods
Patients and evaluation of effectiveness
The diagnosis of RA in the present study was based on the 1987 revised criteria of the American College of Rheumatology (ACR) for the classification of RA or on the 2010 ACR/EULAR classification criteria. We enrolled patients with RA who responded inadequately to MTX (≥6 mg/week) and were commenced on any one of IFX, TCZ (available from 2008), or ABT (available from 2010) as their first biologic between May 2007 and November 2011 at Keio University Hospital and Saitama Medical University Saitama Medical Center. Biologics were administered according to the guidelines set by the Japan College of Rheumatology (http://www.ryumachi-jp.com/guideline.html [in Japanese]). Therapeutic outcomes were defined as achieving remission (REM; defined as clinical disease activity index [CDAI] ≤2.8) or not achieving remission (NON-REM) on the basis of CDAI at 6 months of biologic therapy. CDAI was chosen because other disease activity indexes, such as the disease activity score in 28 joints (DAS28), incorporate inflammatory markers such as C-reactive protein or erythrocyte sedimentation rate (ESR), which may lead to overestimation of the efficacy of TCZ [12, 13]. Patients who discontinued biologic therapy by 6 months due to insufficient effects (n = 5) or adverse events (n = 1) were classified as NON-REM (Additional file 1). The CDAI values of all six cases were >2.8, as determined using the last observation carried forward (LOCF) method. Written informed consent was obtained from all patients in accordance with the Declaration of Helsinki, and the study protocol was approved by the institutional review boards at Keio University and Saitama Medical University.
Before administration of a biologic agent, blood samples were collected in PAXgene Blood RNA tubes [14] (PreAnalytiX, Hombrechtikon, Switzerland). Total RNA was extracted using PAXgene Blood RNA kits (PreAnalytiX) following the manufacturer’s instructions. Total RNA quantity and quality were determined using a NanoDrop 1000 spectrophotometer (NanoDrop Products/Thermo Fisher Scientific, Wilmington, DE, USA) and an Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). All RNA samples fulfilled both of the following criteria: RNA integrity number >6.5 and optical density ratio at 260/280 nm >1.6.
Gene expression measurements
Cyanine 3-labeled complementary RNAs (cRNAs) were synthesized using QuickAmp Labeling Kits (Agilent Technologies). The cRNAs were hybridized at 65 °C for 17 h to Whole Human Genome 44 K Microarrays (design ID 014850; Agilent Technologies). After being washed, the microarrays were scanned using an Agilent DNA microarray scanner (Agilent Technologies). Intensity values of each scanned feature were quantified using Agilent Feature Extraction software (Agilent Technologies). The raw microarray data are deposited in the National Center for Biotechnology Information Gene Expression Omnibus database under accession number [GSE78068]. We applied rank-based quantile normalization to the raw signal data using R software version 3.0.2. Next, probes were filtered based on preexisting annotation with gene symbol and signal intensity (called “present” in more than 50 samples according to GeneSpring software [Agilent Technologies]). For genes with more than one probe, we adopted the probe that had the highest signal intensity. The final number of probes used for subsequent analysis was 14,718.
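The two preprocessing steps described above (rank-based quantile normalization, then keeping one probe per gene) can be illustrated with a minimal sketch. The authors performed these steps in R and GeneSpring; the Python below is only an illustration, and the function names, the tie-breaking rule, and the use of the mean signal across samples to define the "highest signal intensity" probe are assumptions, not details given in the text.

```python
from statistics import mean

def quantile_normalize(matrix):
    """Rank-based quantile normalization (illustrative sketch).

    matrix[i][j] is the raw signal of probe i in sample j. Each value is
    replaced by the mean, across samples, of the values holding the same
    rank. Ties are broken by probe order (a simplifying assumption).
    """
    n_probes, n_samples = len(matrix), len(matrix[0])
    columns = [[row[j] for row in matrix] for j in range(n_samples)]
    # Reference distribution: mean of the k-th smallest value of every sample.
    sorted_cols = [sorted(col) for col in columns]
    reference = [mean(col[k] for col in sorted_cols) for k in range(n_probes)]
    normalized = [[0.0] * n_samples for _ in range(n_probes)]
    for j, col in enumerate(columns):
        order = sorted(range(n_probes), key=lambda i: col[i])
        for rank, i in enumerate(order):
            normalized[i][j] = reference[rank]
    return normalized

def pick_brightest_probe(matrix, probe_genes):
    """Keep, for each gene symbol, the probe row with the highest mean signal.

    Using the mean across samples is an assumption; the text only says
    "the probe that had the highest signal intensity".
    """
    best = {}
    for row, gene in zip(matrix, probe_genes):
        if gene not in best or mean(row) > mean(best[gene]):
            best[gene] = row
    return best
```

After normalization every sample shares the same distribution of values, so differences between samples reflect only the ranks of the probes within each sample.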
GSEA
We employed GSEA to study the molecular biological features of the REM and NON-REM groups associated with each biologic therapy. GSEA is a computational method that determines whether a set of genes defined a priori shows statistically significant, concordant differences between two biological states [11].
We used GSEA v2.1.0, and the input data comprised three data matrices (i.e., 14,718 genes × 140 samples [IFX], 14,718 genes × 38 samples [TCZ], and 14,718 genes × 31 samples [ABT]). Two collections of gene sets defined a priori were used: Reactome gene sets in the Molecular Signatures Database [11], containing 674 pathways, and an integrated list of blood cell type-specific expressed gene sets published by Watkins et al. [15] and Allantaz et al. [16]. The integrated list has 16 blood cell type-specific expressed gene sets (Additional file 2). Permutation type was set as “phenotype,” and the number of permutations was 1000. The population gene set for analysis comprised 14,718 genes, and the metric for ranking genes was the signal-to-noise ratio. Gene set size filters were left at the default settings, where the minimum was 15 and the maximum was 500. Gene sets with a nominal p value <0.05 and a false discovery rate <0.1 were considered significant. We then defined “core genes” as the subset of genes that contributed most to the GSEA enrichment score.
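To make the ranking metric concrete, the signal-to-noise ratio for one gene is the difference between the two class means divided by the sum of the two class standard deviations. The sketch below is illustrative only: the function names are hypothetical, treating NON-REM as the first class is an assumption, and the small-variance adjustments applied by the GSEA software are omitted.

```python
from statistics import mean, stdev

def signal_to_noise(group_a, group_b):
    """(mean_A - mean_B) / (sd_A + sd_B) for one gene.

    Uses the sample standard deviation (an assumption) and does not guard
    against a zero denominator, which the GSEA software handles separately.
    """
    return (mean(group_a) - mean(group_b)) / (stdev(group_a) + stdev(group_b))

def rank_genes(expression, labels, first_class="NON-REM"):
    """Rank genes from most up-regulated to most down-regulated in first_class.

    expression: dict mapping gene symbol -> list of values, one per sample.
    labels: list of class labels ("REM" / "NON-REM"), in the same sample order.
    Returns (ordered gene list, dict of scores).
    """
    scores = {}
    for gene, values in expression.items():
        a = [v for v, lab in zip(values, labels) if lab == first_class]
        b = [v for v, lab in zip(values, labels) if lab != first_class]
        scores[gene] = signal_to_noise(a, b)
    ordered = sorted(scores, key=scores.get, reverse=True)
    return ordered, scores
```

GSEA then walks down this ranked list to compute an enrichment score for each gene set; that step is not reproduced here.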
Real-time quantitative reverse transcription polymerase chain reaction
Real-time quantitative reverse transcription-polymerase chain reactions (qRT-PCRs) were performed for 11–13 samples of each biologic where total RNAs were adequate. Genes measured were APP, AIM2, NLRC4, MEFV, and BCL2L1 for inflammasomes (signature of IFX); PLEKHG1, AFF3, FCER2, UGT8, and CD22 for specific CD19 (signature of TCZ); and BNC2, CD160, PDGFRB, LIM2, and KIR3DL2 for specific CD56 (signature of ABT). We designed custom RT2 Profiler PCR Arrays (QIAGEN, Valencia, CA, USA), and the assay was performed according to the manufacturer’s instructions. Essentially, 500 ng of total RNA of each sample was used to synthesize complementary DNA using the RT2 HT First Strand Kit (QIAGEN). qRT-PCR was performed using the Applied Biosystems 7500 Fast Dx Real-Time PCR System (Thermo Fisher Scientific, Foster City, CA, USA). The relative expression of each gene was quantified by measuring cycle threshold (Ct) values and normalizing against GUSB.
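As an illustration of normalizing Ct values against a reference gene, the sketch below uses the common ΔCt convention, in which relative expression is 2 raised to the power of −ΔCt. The text states only that values were normalized against GUSB; the exact formula, the assumption of 100 % amplification efficiency, and the function name are assumptions made for this example.

```python
def relative_expression(ct_target, ct_reference):
    """Relative expression of a target gene by the delta-Ct convention.

    delta_ct = Ct(target) - Ct(reference, e.g. GUSB); a lower Ct means more
    template, so expression relative to the reference is 2 ** (-delta_ct).
    Assumes perfect doubling per cycle (hypothetical simplification).
    """
    delta_ct = ct_target - ct_reference
    return 2.0 ** (-delta_ct)
```

For example, a target gene that crosses the threshold two cycles later than GUSB would be estimated at one quarter of the GUSB level under these assumptions.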
Calculation of signature score
The scoring system we used, which is clinically applicable to each patient, is shown in Additional file 3. Briefly, the expression of each core gene belonging to a target gene set was standardized using a z-score transformation based on the data of all 209 patients, and the average of the z-scores of all core genes was then defined as the “signature score” of the gene set for each patient.
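The scoring procedure described above can be sketched as follows. This is a minimal illustration rather than the procedure in Additional file 3: the function name and data layout are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption.

```python
from statistics import mean, stdev

def signature_scores(expression, core_genes):
    """Signature score per patient: mean z-score over the core genes.

    expression: dict mapping gene symbol -> list of expression values,
                one per patient, in the same patient order for every gene.
    core_genes: gene symbols that make up the signature.
    """
    z_by_gene = []
    for gene in core_genes:
        values = expression[gene]
        mu, sd = mean(values), stdev(values)  # computed over all patients
        z_by_gene.append([(v - mu) / sd for v in values])
    n_patients = len(z_by_gene[0])
    return [mean(z[p] for z in z_by_gene) for p in range(n_patients)]
```

Because each gene is standardized before averaging, genes with large absolute expression values do not dominate the score.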
ROC analysis
Receiver operating characteristic (ROC) analysis was conducted using the signature score as the predictor of REM versus NON-REM category, and the area under the ROC curve (AUC) was determined. The analysis used the same sample group that was used to construct the gene signature. NON-REM was defined as “positive.” The sensitivity, specificity, positive predictive value (PPV), and negative predictive value were determined at the optimal cutoff value (threshold) from the ROC curve. Analysis was performed using R software version 3.0.2.
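The quantities named in this section can be computed as in the sketch below, which is illustrative and not the authors' R code. Two assumptions are made: a higher score is taken to indicate the positive class (NON-REM), so scores would be negated for a signature in which low values indicate NON-REM; and the "optimal" cutoff is taken to be the one maximizing Youden's J, a criterion the text does not specify. All function names are hypothetical.

```python
def roc_auc(scores, is_positive):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for s, p in zip(scores, is_positive) if p]
    neg = [s for s, p in zip(scores, is_positive) if not p]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def _ratio(num, den):
    """Safe division: returns NaN when the denominator is zero."""
    return num / den if den else float("nan")

def metrics_at(scores, is_positive, threshold):
    """Classify score >= threshold as positive and summarize the 2x2 table."""
    tp = fp = fn = tn = 0
    for s, p in zip(scores, is_positive):
        predicted = s >= threshold
        if predicted and p:
            tp += 1
        elif predicted:
            fp += 1
        elif p:
            fn += 1
        else:
            tn += 1
    return {
        "sensitivity": _ratio(tp, tp + fn),
        "specificity": _ratio(tn, tn + fp),
        "ppv": _ratio(tp, tp + fp),
        "npv": _ratio(tn, tn + fn),
    }

def best_cutoff(scores, is_positive):
    """Observed score maximizing Youden's J = sensitivity + specificity - 1."""
    def youden(t):
        m = metrics_at(scores, is_positive, t)
        return m["sensitivity"] + m["specificity"] - 1
    return max(sorted(set(scores)), key=youden)
```

A call such as `metrics_at(scores, labels, best_cutoff(scores, labels))` then returns the four summary statistics at the chosen threshold.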
Statistical analysis
The CDAI of the six patients whose treatment was terminated before 6 months (see "Patients and evaluation of effectiveness" section) was estimated using the LOCF method. The Kruskal-Wallis test, Wilcoxon’s rank-sum test, or Student’s t test was performed for numerical variables. For categorical variables, Fisher’s exact test was conducted. The associations between CDAI remission at 6 months of biologic therapy and signature scores were evaluated using univariate and multivariate logistic regression analyses (Firth’s penalized likelihood method [17]). For multivariate analyses, we adjusted for marginally significant (p < 0.1) univariate factors as shown in Additional file 4. However, owing to the strong correlation with CDAI (in IFX and TCZ analysis) or concomitant steroid use (in ABT analysis), we did not adjust for 28 tender joint count (TJC28), 28 swollen joint count (SJC28), patient global assessment (PtGA), physician global assessment (PhGA), DAS28-ESR, and Simplified Disease Activity Index (SDAI) in IFX analysis; SJC28, DAS28-ESR, and SDAI in TCZ analysis; or concomitant steroid dose in ABT analysis.
In this study, p < 0.05 was considered significant. The p values derived from these analyses were not adjusted for multiple testing. All statistical analyses were performed with R software version 3.0.2.
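As an illustration of the test used for categorical variables, the sketch below computes a two-sided Fisher's exact p value for a 2 × 2 table. It uses one common definition of the two-sided p value (the sum of the probabilities of all tables that are no more likely than the observed one). The function name is hypothetical, and the counts in the usage note are invented for the example, not taken from the study.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the table [[a, b], [c, d]].

    With row and column totals fixed, the top-left cell follows a
    hypergeometric distribution. The p value sums the probabilities of all
    tables whose probability does not exceed that of the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance so tables tied with the observed one are included.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= observed * (1 + 1e-7)
    )
```

For a hypothetical table with 3 and 1 patients in one row and 1 and 3 in the other, `fisher_exact_two_sided(3, 1, 1, 3)` returns 34/70, or about 0.486.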
Discussion
Most if not all therapeutic effect prediction studies based on gene expression have focused on single rather than multiple biologic agents. Variations in the design of these studies, including subject recruitment criteria, evaluation of treatment response, and assay system platform, represent a major challenge to combining the studies’ findings in translational studies. Therefore, it is important to develop a unified test platform that allows a fair, side-by-side comparison among multiple biologic agents and hence anticipation of therapeutic outcomes. In this study, we have established a clinically practical system to predict the therapeutic effects of three biologics (IFX, TCZ, and ABT). First, we enrolled patients with RA who showed inadequate response to MTX and were administered one of the three biologics for the first time. Second, we used CDAI to evaluate therapeutic effects so as to minimize bias among the three drugs [12, 13]. Third, total RNAs were extracted from whole blood with a well-standardized RNA extraction method (PAXgene blood RNA system [14]) and analyzed with a single microarray platform (Agilent Technologies).
There was no overlap of significant gene sets among the three biologics in GSEA, which is consistent with the distinct molecular target of each biologic. This finding encouraged us to apply the same method and platform to the comparison of other drugs.
We observed that NON-REM in the IFX group was typically reflected in upregulated expression of inflammasome genes. The inflammasome is a multiprotein complex that plays a key role in the production of proinflammatory cytokines such as interleukin (IL)-1β and IL-18 [18], and it is associated with the pathology of various autoimmune diseases, including RA [19, 20].
Takeuchi et al. reported that the baseline TNF protein level could indicate the amount of IFX needed for a patient to achieve an effective response [21]. Moreover, the inflammasome was reported to be activated downstream of TNF signaling [22–24]. Therefore, our observation of upregulated expression of inflammasome-related genes in the NON-REM group likely reflects stimulated TNF signaling that could not be attenuated by a standard amount of IFX. For such patients, administering a higher dosage of IFX could be a more plausible approach. Differential expression of TNF mRNA between the REM and NON-REM groups was not observed in our analysis, as TNF protein is found mainly in inflamed joints rather than in whole blood. In the future, it would be interesting to delineate the relationship between expression of inflammasome-related genes and concentration of TNF in the blood.
For TCZ, we found that a B-cell-related gene set is a promising predictive signature because patients who had low expression of B-cell-related genes had poor remission rates. TCZ works as an inhibitor of IL-6 receptor signaling by directly targeting soluble and membrane-bound IL-6 receptors. IL-6 is an important B-cell-stimulating factor and induces antibody synthesis [25], and, in RA pathogenesis, IL-6 induces autoantibody-producing plasma cells [26]. Furthermore, a subset of B cells, especially memory B cells, was previously reported to decrease when TCZ was administered to patients with RA [27, 28]. These findings indicate a close relationship between TCZ response and B cells, as also suggested by our results. The underlying cause differentiating REM and NON-REM in the TCZ group could be the ability to regulate the number of B cells and/or the functional subtypes of B cells (memory B cells), as reflected by the expression difference.
NK-cell-related genes were significant predictors of NON-REM in the ABT group. The expression of NK-cell-related genes was higher in the NON-REM group than in the REM group. As a component of the innate immune system, NK cells are known to regulate the activities of dendritic cells, macrophages, and T cells [29]. For example, NK cells were demonstrated to negatively regulate self-responsive T cells in various autoimmune disease models [30–32]. In patients expressing high levels of NK-cell-related genes, T-cell activity may already be suppressed, so therapy with ABT, which also suppresses T cells, could be redundant. It is more likely that factors other than T cells contribute to disease in this type of patient. However, as pointed out by Shegarfi et al. [33], the role of NK cells in the development of RA is worth further delineation.
Core genes found in this study differ from marker genes identified in other studies (IFX: Lequerré et al. [2], Tanino et al. [3], Julia et al. [4], Stuhlmüller et al. [5], Cui et al. [6], and Oswald et al. [34]; TCZ: Sanayama et al. [7]), owing to differences in the parameters used to evaluate therapeutic outcomes in each study (e.g., DAS28, EULAR criteria), the type of samples used (whole blood or peripheral blood mononuclear cells), and sample size. The most important contributing factor could be the analytical approach. Most biological phenomena, especially the development of heterogeneous diseases such as RA, are a consequence not of aberrant individual genes but rather of a network of related genes. Therefore, we employed GSEA to capture the biological features of gene sets, which should provide a robust model to predict the efficacy of biologics. In fact, functional gene set analysis was successful in identifying interferon gene sets as predictors of the efficacy of rituximab [8, 9].
The performance (i.e., AUC of the ROC curve) of predicting a therapeutic effect (NON-REM) using the signature score for each drug (i.e., inflammasomes, specific CD19, and specific CD56) was 0.637, 0.796, and 0.768 for IFX, TCZ, and ABT, respectively. At the optimal cutoff value derived from ROC analysis, a notable feature was the high PPVs, which were 83.6 %, 92.3 %, and 94.7 % for IFX, TCZ, and ABT, respectively (Fig. 3). In other words, our approach could accurately indicate which patients would be unlikely to achieve remission. Although it represents an elimination approach rather than selection of a biologic option, it should be equally effective at a practical clinical level in the context of increasing the probability that a patient would achieve remission. Using this approach, we also discovered a group of patients (Fig. 4, group 1), constituting about 20 % of patients in this study, who were not likely to achieve remission with any of the three biologics (the remission rate was merely 11.9 % [5 of 42]). Future studies exploring biologics other than the three in the present study, or differentiation analysis to predict achievement of low disease activity, are essential. However, since ROC analysis was conducted using the same sample group that was used to construct the gene signatures, the reported performance may be overestimated because of overfitting. It is essential to validate our results in independent cohorts in the future.
Because this was an observational study conducted in an actual clinical setting, it inevitably has limitations, which include bias in clinical background and unequal numbers of samples among the studied biologics. IFX was approved long before TCZ and ABT in Japan, which explains why the IFX group outnumbered the TCZ and ABT groups. We believe that the actual clinical setting also led to differences in clinical background among the three biologics; for example, the ABT group was older than the other two, and coadministration with MTX was more common in the IFX group. An independent cohort study in which clinical background is matched could provide a clearer answer. Another limitation was the significant difference in baseline clinical background between the REM and NON-REM groups. Although we have shown that the gene expression signature score remained significant after adjusting for baseline clinical background, the small number of samples in the ABT and TCZ groups means that this result might not be fully persuasive. We are planning to increase the number of samples for validation. Last but not least, as the specimens used in this study contained RNA extracted from whole blood, which is composed of various types of blood cells, it is not clear whether the gene expression signatures merely reflect differences in blood cell composition. This remains to be addressed by analyzing immunophenotyping data in the future.
Abbreviations
ABT, abatacept; ACPA, anticyclic citrullinated peptide antibodies; ACR, American College of Rheumatology; bDMARD, biologic disease-modifying antirheumatic drug; CDAI, Clinical Disease Activity Index; cRNA, complementary RNA; CRP, C-reactive protein; csDMARD, conventional synthetic disease-modifying antirheumatic drug; Ct, cycle threshold; DAS28, Disease Activity Score in 28 joints; ESR, erythrocyte sedimentation rate; EULAR, European League Against Rheumatism; FDR, false discovery rate; GSEA, Gene Set Enrichment Analysis; IFX, infliximab; LOCF, last observation carried forward; MTX, methotrexate; NK, natural killer; IL, interleukin; NES, normalized enrichment score; NOM, nominal p value; NON-REM, patients without Clinical Disease Activity Index remission at 6 months of biologic therapy; NPV, negative predictive value; PhGA, physician global assessment; PPV, positive predictive value; PtGA, patient global assessment; qRT-PCR, real-time quantitative reverse transcription-polymerase chain reaction; RA, rheumatoid arthritis; REM, patients with Clinical Disease Activity Index remission at 6 months of biologic therapy; RF, rheumatoid factor; RNA pol II, RNA polymerase II; SDAI, Simplified Disease Activity Index; SJC28, 28 swollen joint count; TCZ, tocilizumab; TJC28, 28 tender joint count; TNF, tumor necrosis factor.
Acknowledgements
We thank the patients and staff members who participated in this study. This work was supported by subsidies from Japan’s New Energy and Industrial Technology Development Organization and in part by funding from DNA Chip Research Inc.
Competing interests
SN, HI, YH, CRL, YI, and KM are members of DNA Chip Research Inc. RM is a chief executive officer of DNA Chip Research Inc. KS has received research grants from Bristol-Myers Squibb and Eisai Co., Ltd. HK has received grants from AbbVie G.K., Astellas Pharma, Chugai Pharmaceutical Co., Ltd., Eisai Co., Ltd., Mitsubishi Tanabe Pharma Co., Pfizer Japan Inc., Santen Pharmaceutical Co., Ltd., and Takeda Pharmaceutical Co., Ltd. HK has received speaking fees from AbbVie G.K., Astellas Pharma, Bristol-Myers K.K., Chugai Pharmaceutical Co., Ltd., Eisai Co., Ltd., Janssen Pharmaceutical K.K., Mitsubishi Tanabe Pharma Co., Nippon Kayaku Co., Ltd., Pfizer Japan Inc., Takeda Pharmaceutical Co., Ltd., and UCB Pharma. HK has received consulting fees from AbbVie G.K., Eli Lilly Japan K.K., Novartis Pharma K.K., Sanofi Pharma, and Nippon Kayaku Co., Ltd. KA has received grants from Astellas Pharma, Chugai Pharmaceutical Co., Ltd., Mitsubishi Tanabe Pharma Co., and Pfizer Japan Inc. KA has received speaking fees from AbbVie G.K., Astellas Pharma, Bristol-Myers K.K., Chugai Pharmaceutical Co., Ltd., Mitsubishi Tanabe Pharma Co., and Pfizer Japan Inc. TT has received grants from AbbVie G.K., Asahi Kasei Pharma Corp., Astellas Pharma, Bristol-Myers K.K., Chugai Pharmaceutical Co., Ltd., Daiichi Sankyo Co., Ltd., Eisai Co., Ltd., Mitsubishi Tanabe Pharma Co., Pfizer Japan Inc., Santen Pharmaceutical Co., Ltd., SymBio Pharmaceuticals Ltd., Takeda Pharmaceutical Co., Ltd., Taisho Toyama Pharmaceutical Co., Ltd., and Teijin Pharma Ltd. TT has received speaking fees from AbbVie G.K., Astellas Pharma, Bristol-Myers K.K., Celltrion, Inc., Chugai Pharmaceutical Co., Ltd., Daiichi Sankyo Co., Ltd., Eisai Co., Ltd., Janssen Pharmaceutical K.K., Mitsubishi Tanabe Pharma Co., Nippon Kayaku Co., Ltd., Pfizer Japan Inc., and Takeda Pharmaceutical Co., Ltd. 
TT has received consulting fees from AbbVie G.K., Asahi Kasei Medical K.K., Astra Zeneca K.K., Bristol-Myers K.K., Daiichi Sankyo Co., Ltd., Eli Lilly Japan K.K., Mitsubishi Tanabe Pharma Co., Nippon Kayaku Co., Ltd., and Novartis Pharma K.K. The authors declare that they have no nonfinancial competing interests.