Background
Methotrexate (MTX) is the preferred initial disease-modifying antirheumatic drug (DMARD) for rheumatoid arthritis (RA). While MTX is the only drug needed to control RA disease activity for many patients with RA, up to 50% of patients respond inadequately to MTX and require additional treatments [1]. A 3- to 6-month trial of MTX treatment is generally recommended before a decision is made regarding MTX efficacy [2]. This delay can result in missing the window of opportunity for effective treatment of RA disease activity and unnecessary exposure to potential MTX-related side effects. Response to therapy in the first 6 months of RA diagnosis correlates with long-term outcomes [3], offering a strong rationale for identifying early predictors of treatment response.
Longer RA disease duration, higher baseline disease activity score including 28 joints (DAS28), female sex, younger age, smoking, and alcohol consumption have been associated with a lower likelihood of treatment success with MTX in observational studies and clinical trials [4–10]. These predictors vary depending on the definition of treatment response, i.e., achieving a state of remission or low disease activity versus absolute improvement in disease activity metrics, and there is a lack of a uniform and clinically useful prediction model of treatment response to MTX [11].
The use of machine learning (ML) in data analysis to inform individualized clinical decision-making and improve patient outcomes is on the rise across the spectrum of medical specialties, including rheumatology [12–14]. A recent large observational study of MTX-naïve patients with early RA (n > 5000) showed that ML methods integrating baseline clinical data did not significantly improve the prediction of MTX treatment persistence at 12 months compared to manual modeling, and the highest area under the curve (AUC) for least absolute shrinkage and selection operator (LASSO) regression was only 0.67 [15]. A smaller observational study (n = 355) showed that the performance of ML algorithms (LASSO models, AUC 0.76) was not superior to that of logistic regression (AUC 0.77) in predicting DAS28 > 3.2 at 3 months of treatment in patients with RA who used MTX as monotherapy or in combination with other DMARDs [16]. Whether ML can provide a robust and clinically useful prediction of response to MTX monotherapy in the first months of treatment in patients with early RA using uniformly collected baseline demographics and clinical data has not been investigated in large patient populations.
To address this knowledge gap in RA management, we applied ML methods to randomized clinical trial (RCT) data to (1) algorithmically identify the classes of patients with RA and distinct trajectories of their DAS28 erythrocyte sedimentation rate (DAS28-ESR) from baseline to week 24, (2) identify the clinical predictors of belonging to one versus the other class(es) with external validation of the model, and (3) identify the predictors of achieving DAS28-ESR ≤ 3.2 at 24 weeks among patients with incomplete response, i.e., DAS28-ESR > 3.2 at week 12.
We hypothesized that patients with RA have distinct trajectories of response to MTX based on DAS28-ESR in the first 24 weeks of treatment and that response to MTX can be reliably predicted using a combination of baseline clinical data with the highest predictive importance, with validation in an independent dataset.
Discussion
Approaches for dealing with the vast heterogeneity in response to MTX among individual patients with RA are insufficiently addressed in the current treatment guidelines, and systematic patient-tailored tools to personalize early RA management are lacking [27, 28]. This is one of the first studies using ML methods to identify the latent trajectories of DAS28-ESR over 24 weeks in new users of MTX with high RA disease activity at baseline. The clinical phenotype that we defined as “good responders” comprised a lower baseline DAS28-ESR score and its individual components, positive ACPA, and a lower baseline HAQ score, ranked in the order of predictive importance. “Good responders” at 24 weeks accounted for 66% of all patients, consistent with the previously reported rate of response to MTX at this time point [29]. The finding that lower baseline disease activity and better functional status at baseline are predictive of “good responders” to MTX is not unexpected and is concordant with previous studies [11, 29, 30]. In addition, we have provided cut points and a matrix prediction model for a good response to MTX: patients with DAS28-ESR ≤ 7.4, positive ACPA, and HAQ ≤ 2 at baseline have an 80% probability of being good responders, while having DAS28-ESR and HAQ above these cutoffs in combination with ACPA-negative status results in only a 33% probability of good response. Consistent with the prediction model, DAS28-ESR was the most important predictor in multivariate risk profiling. These findings can inform clinical decision-making and patient-physician discussions about the likelihood of treatment success in patients initiating MTX. By quantifying the probability of response to MTX using baseline clinical data, our findings may facilitate consideration of the use of biologics or targeted synthetic DMARDs in treatment-naïve patients with RA and poor prognostic factors, a scenario of RA treatment which is not specifically addressed in the current ACR or EULAR guidelines [27, 28]. Further studies will be needed to assess the clinical utility of our model and its implications for clinical practice.
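The two extreme cells of the matrix prediction model described above can be written as a small lookup. This is only an illustrative sketch: the function name and the choice to return `None` for intermediate combinations are assumptions, since the probabilities for the remaining cells of the matrix are not stated in this section.

```python
from typing import Optional

# Cut points quoted in the text above.
DAS28_ESR_CUTOFF = 7.4
HAQ_CUTOFF = 2.0


def good_response_probability(
    das28_esr: float, acpa_positive: bool, haq: float
) -> Optional[float]:
    """Return the reported probability of being a good responder.

    Only the most favorable and least favorable baseline profiles are
    quantified in the text; all other combinations return None
    (hypothetical behavior chosen for this sketch).
    """
    low_das = das28_esr <= DAS28_ESR_CUTOFF
    low_haq = haq <= HAQ_CUTOFF
    if low_das and acpa_positive and low_haq:
        return 0.80  # all three favorable features present
    if not low_das and not acpa_positive and not low_haq:
        return 0.33  # all three unfavorable features present
    return None  # intermediate cells are not given in this section


print(good_response_probability(6.5, True, 1.2))
print(good_response_probability(7.9, False, 2.5))
```

Returning `None` rather than interpolating keeps the sketch from implying probabilities that the text does not report.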
The components of our model were very similar to those in the Rheumatoid Arthritis Medication Study (RAMS) [29], except that in that study lower DAS28 was paradoxically associated with non-response to MTX. This was likely because the definition of response in the RAMS required a decline of at least 0.6 units in DAS28 score, which is more likely to happen in patients with higher DAS28 at baseline. In our study, the classes of patients were defined algorithmically based on DAS28 trajectory, and the prediction was for the class of responders rather than for the change in DAS28, consistent with the association between lower DAS28 and achieving a “state” outcome (i.e., remission or low disease activity) in prior studies [11].
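The contrast between a change-based and a state-based definition of response can be illustrated with a short sketch. Both functions are deliberate simplifications written for illustration (the full response criteria used in the RAMS are not reproduced in this section), and the two example patients are hypothetical.

```python
# Thresholds quoted in the text: a 0.6-unit decline for the change-based
# definition, and DAS28 <= 3.2 (low disease activity) for the state-based one.
MIN_DECLINE = 0.6
LOW_DISEASE_ACTIVITY = 3.2


def change_responder(baseline: float, follow_up: float) -> bool:
    """Simplified change-based response: DAS28 fell by at least 0.6 units."""
    return (baseline - follow_up) >= MIN_DECLINE


def state_responder(follow_up: float) -> bool:
    """State-based response: low disease activity reached at follow-up."""
    return follow_up <= LOW_DISEASE_ACTIVITY


# Hypothetical patients: (baseline DAS28, follow-up DAS28).
patients = {"high baseline": (7.5, 6.5), "lower baseline": (3.6, 3.1)}
for label, (baseline, follow_up) in patients.items():
    print(label, change_responder(baseline, follow_up), state_responder(follow_up))
```

The patient with the high baseline score counts as a responder by change but not by state, and the patient with the lower baseline score shows the reverse, which mirrors the apparent paradox discussed above.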
Unlike the previous studies which evaluated the outcomes at 12 months [11, 26], we focused on the earlier time points (i.e., 3 and 6 months) as the most critical “window of opportunity” for decision-making regarding the future management plan in early RA, consistent with the American College of Rheumatology recommendations [31]. Indeed, RA disease duration was one of the top predictors in our random forests, consistent with the established knowledge that delay in treatment is an adverse prognostic factor in achieving treatment targets in RA [4, 5, 9].
In line with our finding of ACPA positivity as a predictor of the “good responders” class, higher likelihood of early response (i.e., 4 months) to treatment with MTX in seropositive patients with abundant autoantibodies including ACPA was reported in the induction therapy with MTX and prednisone in RA or very early arthritic disease (IMPROVED) study [32]. However, no such association was found for response at 1 year in the IMPROVED study. In the RAMS, not being RF-positive was associated with non-response to MTX at 6 months [29]. It has been suggested that the presence of multiple autoantibodies at baseline may reflect a more active autoimmune response which is more susceptible to suppression by MTX in the initial stages, but not in later stages [32]. Indeed, ACPA positivity has been associated with lower rates of drug-free remission in RA in several large studies [33–36], potentially indicating the persistence of a population of ACPA IgG-producing autoreactive B-cells that is resistant to therapy and accounts for the inability to achieve drug-free remission in RA in the long run [32]. Given that ACPA and RF seropositivity can be helpful in informing response to treatment with other antirheumatic medications in RA (e.g., abatacept and rituximab), confirming the value of differential prediction of early response to MTX treatment in RA by serostatus in future studies may help to further refine the approach to the management of early RA [37, 38].
Among the individual components of DAS28, TJC28, ESR, PtGA, and SJC28 were among the top five predictors of “good responders.” Concordantly, higher TJC28 was an independent predictor of non-response to MTX at 6 months in the RAMS [29]. DAS28, TJC28, HAQ, and ESR were among the top predictors of insufficient response to DMARD treatment using LASSO and random forests in a recent study from The Netherlands [16]. While at least one-third of patients in the study by Gosselt et al. used MTX in combination with sulfasalazine or hydroxychloroquine, our study included patients on MTX who were not on other DMARDs.
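For readers less familiar with how the four components discussed above combine into the composite score, the standard published DAS28-ESR formula can be sketched as follows. The function name and input checks are illustrative assumptions, not part of the study's code; PtGA is assumed to be recorded on a 0–100 mm visual analogue scale.

```python
import math


def das28_esr(tjc28: int, sjc28: int, esr_mm_h: float, ptga_mm: float) -> float:
    """Compute DAS28-ESR from its four components.

    tjc28 / sjc28: tender and swollen joint counts (0-28)
    esr_mm_h:      erythrocyte sedimentation rate in mm/h (must be > 0)
    ptga_mm:       patient global assessment on a 0-100 mm scale
    """
    if not (0 <= tjc28 <= 28 and 0 <= sjc28 <= 28):
        raise ValueError("joint counts must be between 0 and 28")
    if esr_mm_h <= 0:
        raise ValueError("ESR must be positive to take its logarithm")
    return (
        0.56 * math.sqrt(tjc28)
        + 0.28 * math.sqrt(sjc28)
        + 0.70 * math.log(esr_mm_h)
        + 0.014 * ptga_mm
    )


# Hypothetical patient: 16 tender joints, 9 swollen joints, ESR 40 mm/h, PtGA 50 mm.
print(round(das28_esr(16, 9, 40, 50), 2))
```

Because the tender joint count carries twice the weight of the swollen joint count and ESR enters on a log scale, the components do not contribute equally to the score, which is one reason they can rank differently as individual predictors.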
All four RCTs included in the study used an up-titration scheme for MTX, maximizing the dose to 20–25 mg/week, thus decreasing the possibility of non-response due to underdosing. Good responders were more likely to use glucocorticoids at baseline, concordant with the data that MTX monotherapy with glucocorticoid bridging can be clinically beneficial in achieving treatment response in RA [39].
Sociodemographic characteristics were among the predictors in random forests models but were not retained in the LASSO models. Sociodemographic and economic parameters have been associated with the persistence of MTX treatment in prior studies using ML methods, but these models are not directly comparable to the prediction of response to MTX (i.e., MTX efficacy) in our study [15]. There is some ambiguity in the predictive role of sociodemographics for MTX response, mainly with regard to the predictive role of the female sex, which was found to be associated with a lower likelihood of response in some but not other studies [5, 9, 29, 40, 41]. This discrepancy may be at least in part due to the use of disease activity metrics including ESR without accounting for age- and sex-specific cutoffs for ESR, which may bias the assessment of treatment response. In this study, we were not able to retrieve CRP measures for 12 and 24 months but used DAS28-ESR, which was an outcome measure used in the included RCTs. Using disease activity metrics that do not include ESR in future studies may help further refine our understanding of the predictive value of sex in response to antirheumatic treatments.
In this study, we successfully performed external validation of our models, which has not been done in the previous studies [16, 29]. Importantly, models with individual components of DAS28-ESR had similar performance to models with the DAS28-ESR score, supporting the construct validity of the DAS28-ESR measure. The modest discrimination value (AUC 0.79) of our LASSO models is non-inferior to that of the previous models including clinical predictors of response to MTX in RA [16, 29] and indicates the need for additional biomarkers to improve the performance of individualized predictive models. Studies are underway by our group to augment clinical predictors with genomic, metabolomic, and microbiome data [42, 43].
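As a reminder of what the discrimination values above measure, the AUC can be read as the probability that a randomly chosen responder receives a higher predicted score than a randomly chosen non-responder. The sketch below computes it by direct pairwise comparison; the function name and the toy labels and scores are hypothetical and unrelated to the study data.

```python
from typing import Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison.

    labels: 1 for responders, 0 for non-responders
    scores: predicted probability (or any ranking score) of response
    Ties between a responder and a non-responder count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy example: 3 of the 4 responder/non-responder pairs are ranked correctly.
print(auc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1]))
```

On this reading, an AUC of 0.79 means that roughly four out of five responder/non-responder pairs are ordered correctly, while 0.5 corresponds to chance.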
Among patients who had DAS28-ESR > 3.2 at 12 weeks, the majority (81%) did not achieve DAS28-ESR ≤ 3.2 at week 24, which is concordant with studies showing that in over 75% of patients the trajectory of response or non-response to MTX is consistent between 3 and 6 months [44]. Extending this prior knowledge, we have identified and ranked in the order of importance the predictors of achieving low disease activity by DAS28-ESR at 24 weeks among those who did not achieve DAS28-ESR ≤ 3.2 at 12 weeks. A steeper decline in DAS28-ESR (i.e., at least 1 point improvement in DAS28-ESR) from baseline to week 12 was identified as the top predictor of response in this group, which can be used in clinical decision-making and discussions regarding the likelihood of achieving low disease activity at 24 weeks among patients who have not achieved low disease activity at 12 weeks. This finding applies to patients who continue MTX monotherapy for the first 24 weeks of treatment. Further studies are needed to understand the relevance of this initial improvement in DAS28-ESR for the prediction of response to therapy escalation at week 12. Other predictors of DAS28-ESR ≤ 3.2 at week 24 in this group included lower baseline DAS28-ESR, younger age, positive ACPA, shorter RA duration, and lower baseline HAQ, in line with our main prediction model.
There are several limitations to our study. First, we had no information about some risk factors that have been previously linked to lower likelihood of treatment response, i.e., low socioeconomic status, smoking, obesity, mental and physical comorbidities, and non-adherence to treatment [29, 45, 46]. The addition of these risk factors would be expected to further refine the prediction. Second, data for some variables (primarily ACPA and glucocorticoid use) were missing for a proportion of patients. Imputation of the missing values in predictor data and addition of missing data indicator variables in the statistical models can help minimize this shortcoming. The missing data indicator did not appear to be among the significant predictors in the main LASSO models with DAS28 and its components but was present in the random forests models, requiring caution in interpreting the results. Third, most patients included in this study had high RA disease activity, reflective of the RCT study population. Thus, the results may not be generalizable to patients with low-moderate RA disease activity at baseline. Future studies identifying predictors of response to MTX in patients with low-moderate RA disease activity should help to further inform early RA management.
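The combination of imputation with a missing data indicator mentioned in the second limitation can be sketched as below. The imputation method is not specified in this section, so median imputation is used here purely as an assumption, and the function name and example values are hypothetical.

```python
from statistics import median
from typing import List, Optional, Sequence, Tuple


def impute_with_indicator(
    values: Sequence[Optional[float]],
) -> Tuple[List[float], List[int]]:
    """Fill missing predictor values and flag which entries were missing.

    Returns (imputed_values, missing_indicator), where the indicator is 1
    for entries that were originally None. Median imputation is an
    assumption of this sketch, not necessarily the method used in the study.
    """
    observed = [v for v in values if v is not None]
    if not observed:
        raise ValueError("cannot impute a variable with no observed values")
    fill = median(observed)
    imputed = [fill if v is None else v for v in values]
    indicator = [1 if v is None else 0 for v in values]
    return imputed, indicator


# Hypothetical baseline HAQ scores with one missing entry.
haq_imputed, haq_missing = impute_with_indicator([1.0, None, 3.0])
print(haq_imputed, haq_missing)
```

Entering the indicator as its own predictor lets a model reveal whether missingness itself carries information, which is why its appearance among the random forest predictors calls for cautious interpretation.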
The main strengths of the study include the use of high-quality longitudinal data of treatment-naïve patients with active RA from 4 RCTs, forming a large training dataset with an independent testing set for external validation. We used a data-driven ML approach combining sociodemographic and clinical data to identify distinct patient trajectories based on DAS28-ESR. The model included clinician-friendly, easily available clinical measures and was externally validated in an independent test set.
In conclusion, we have developed and externally validated a prediction model for response to MTX monotherapy within 24 weeks in DMARD-naïve patients with high RA disease activity at baseline, providing variably weighted predictive clinical features and defined cutoffs for clinical decision-making. The trajectory of DAS28-ESR change over 24 weeks in patients with high RA disease activity at baseline who are starting MTX can be predicted by baseline DAS28-ESR, ACPA status, and HAQ score. These parameters should be considered as part of the clinical decision-making process when initiating MTX in DMARD-naïve patients with RA. For example, a patient with RA who is ACPA-positive and has HAQ ≤ 2 and DAS28-ESR ≤ 7.4 would have a high predicted probability (80%) of achieving a good response to MTX therapy. In contrast, a patient with DAS28-ESR > 7.4, who is ACPA-negative and has HAQ > 2, would be predicted to have a poor response to MTX treatment (33% probability of a good response), and a more aggressive treatment regimen (e.g., addition of a biologic agent) can be discussed. Patients with a decline of at least 1 unit in DAS28-ESR within the first 12 weeks of treatment who have not achieved low disease activity by week 12 may be more likely to achieve low disease activity at 24 weeks. This information may allow physicians to tailor treatment approaches based on the likelihood of treatment response and can help improve RA disease outcomes and patient satisfaction by better managing patients’ expectations.