Introduction
In the United States (US), atrial fibrillation (AFib) is projected to affect 12 million people by 2030 [1]. AFib has been recorded as the primary diagnosis for more than 454,000 hospitalizations and contributes to more than 158,000 deaths annually [2‐4]. Among patients with AFib, coexisting cancer increases the incidence of adverse events such as ischemic stroke, venous thromboembolism (VTE), bleeding, and death compared with AFib patients without cancer [5‐8]. Current management of patients with AFib and cancer with oral anticoagulants (OACs) remains suboptimal because clinical practice guidelines offer insufficient evidence on risk assessment and treatment optimization [9].
The CHA₂DS₂-VASc score is a composite of congestive heart failure (1 point), hypertension (1), age ≥ 75 years (2), diabetes mellitus (1), prior stroke, transient ischemic attack (TIA), or thromboembolism (2), vascular disease (e.g., peripheral artery disease, myocardial infarction, aortic plaque) (1), age 65–74 years (1), and sex category (1), and it has been used to evaluate the risk of stroke in patients with AFib [10, 11]. Clinical guidelines recommend OACs for patients with CHA₂DS₂-VASc scores ≥ 2 [11, 12]. However, the CHA₂DS₂-VASc score is not highly predictive in patients with AFib and cancer [13, 14]. The HAS-BLED score has been widely used to stratify the risk of bleeding. It is calculated from the presence of hypertension (1), abnormal renal/liver function (1 + 1), stroke (1), bleeding tendency or predisposition (1), labile INR for patients taking warfarin (1), age ≥ 65 years (1), and drugs (concomitant aspirin or NSAIDs) or excess alcohol use (1 + 1) [15]. The 2020 European Society of Cardiology (ESC) guideline suggests that a score ≥ 3 indicates “high risk” [12]. However, the guideline does not recommend against the use of anticoagulants in such patients; rather, caution and regular monitoring after treatment initiation are needed [12]. Nonetheless, the usefulness of HAS-BLED in cancer patients is inconclusive because cancer is an independent risk factor for bleeding among patients with AFib [16]. Pastori et al. compared the performance of multiple bleeding risk scores among cancer patients and found that HAS-BLED was not highly predictive of major and gastrointestinal bleeding [17].
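As an illustration of how these additive scores are computed, the following is a minimal Python sketch based only on the point values listed above. The class, field, and function names are hypothetical and do not come from the study. The sketch assumes that the sex category point is assigned to female patients, as in the standard definition of the score, and it is not a clinical tool.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Field names are illustrative assumptions, not study variables.
    age: int
    female: bool = False
    heart_failure: bool = False
    hypertension: bool = False
    diabetes: bool = False
    prior_stroke_tia_te: bool = False
    vascular_disease: bool = False


def cha2ds2_vasc(p: Patient) -> int:
    """Add up the CHA2DS2-VASc points described in the text."""
    score = 0
    score += 1 if p.heart_failure else 0
    score += 1 if p.hypertension else 0
    if p.age >= 75:          # age >= 75 years: 2 points
        score += 2
    elif p.age >= 65:        # age 65-74 years: 1 point
        score += 1
    score += 1 if p.diabetes else 0
    score += 2 if p.prior_stroke_tia_te else 0
    score += 1 if p.vascular_disease else 0
    score += 1 if p.female else 0   # sex category (assumed: female)
    return score


def has_bled(hypertension=False, abnormal_renal=False, abnormal_liver=False,
             stroke=False, bleeding=False, labile_inr=False,
             elderly=False, drugs=False, alcohol=False) -> int:
    """Each HAS-BLED component present contributes 1 point (maximum 9)."""
    return sum(map(bool, [hypertension, abnormal_renal, abnormal_liver,
                          stroke, bleeding, labile_inr,
                          elderly, drugs, alcohol]))


if __name__ == "__main__":
    example = Patient(age=78, female=True, hypertension=True)
    print("CHA2DS2-VASc:", cha2ds2_vasc(example))
    print("HAS-BLED:", has_bled(hypertension=True, elderly=True, drugs=True))
```

Because every component enters as a fixed number of points, neither score can represent interactions between conditions.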
Therefore, there is an urgent need to develop new risk assessment tools for stroke and bleeding in patients with AFib and cancer. Traditional risk assessment tools such as CHA₂DS₂-VASc and HAS-BLED are simple and easy for clinicians to implement because they are linear combinations of patients’ diseases and conditions. However, when the relationships between patients’ characteristics and outcomes become more complicated, as they may in patients with AFib and cancer, these tools may not perform well. Recently, machine learning (ML) algorithms have been increasingly used to support clinical decision-making, such as identifying patients with dementia in primary care, monitoring anticoagulation, and measuring quality of care before treatment in patients with hepatitis C [18‐20]. Compared with conventional regression-based methods, ML models are able to learn from the data when the association between predictors and outcome variables is not linear. ML models have outperformed parametric regressions in handling high-dimensional data and interactions between variables in complex data structures [20‐22].
In this study, we developed and validated ML algorithms to predict risk of stroke and bleeding events among patients with AFib and cancer, using US cancer registry and administrative claims linked datasets.
Discussion
Our study is among the first to develop and validate ML algorithms that predict adverse outcomes exclusively for patients with AFib and cancer. In this cohort study, we demonstrated that applying ML algorithms to SEER-Medicare data can be a promising approach to predicting the short-term (1-year) risk of stroke among patients with AFib and cancer. For older adults with cancer who are newly diagnosed with AFib, clinicians can collect patients’ demographics, socioeconomic status, medical history, and medication history from routine medical records and/or patient surveys, then use this tool to predict patients’ risk of stroke. Our ML algorithms can help clinicians identify high-risk patients and facilitate treatment decisions (i.e., medication or non-pharmacological intervention) among older adults with AFib and cancer across the US.
RF outperformed the other ML models on all metrics (AUC, sensitivity, specificity, and F2 score) for ischemic stroke. Although widely accepted as a risk assessment tool for stroke among patients with AFib, the CHA₂DS₂-VASc score has failed to achieve high performance in patients with AFib and cancer, especially in new-onset AFib [9, 14, 49]. In this study, the CHA₂DS₂-VASc score performed better than all ML models except RF in identifying patients with ischemic stroke; however, it could not differentiate those at lower risk (low specificity). In fact, 91.9% of patients in this study had CHA₂DS₂-VASc ≥ 2 and would have been recommended for OACs according to current guidelines [11, 12]. The major limitation of CHA₂DS₂-VASc is the absence of a cancer indicator, although cancer has been suggested as an independent risk factor for stroke [50, 51]. A recently published study suggested incorporating cancer into the CHA₂DS₂-VASc score to improve the predictive ability of the original score [52]. Indeed, the CHA₂DS₂-VASc score is a linear combination of conditions for predicting stroke [10]. In the presence of cancer, the relationship between patient characteristics and ischemic stroke may become more complicated (i.e., non-linear), so it is not surprising that the CHA₂DS₂-VASc score failed to achieve high performance. In our study, linear models such as elastic net and SVM had lower performance metrics than non-linear models such as RF and XGBoost. Consistent with CHA₂DS₂-VASc, we found that prior stroke was among the most important features in all ML algorithms. However, our approach incorporated a more comprehensive set of patient characteristics. For example, patients’ socioeconomic status (household median income and education level) and cancer characteristics (cancer type, active cancer status) ranked among the top features in RF and XGBoost. The importance of these features highlights the contributions of health disparities and cancer characteristics to stroke prediction. The inclusion of these variables may be useful in identifying high-risk patients [53]. However, it should also be noted that tree-based models may inflate the impact of continuous features in their predictions [45]. Clinicians may consider initiating OACs for those identified by our RF algorithm as being at high risk of stroke.
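To make the metrics discussed above concrete, the following minimal Python sketch computes sensitivity, specificity, and the F-beta score from confusion-matrix counts, with beta = 2 so that recall is weighted more heavily than precision. The function name and all counts are hypothetical and are not results from this study. They only illustrate how a rule that flags nearly every patient can reach high sensitivity while specificity stays low.

```python
def classification_metrics(tp: int, fp: int, tn: int, fn: int,
                           beta: float = 2.0) -> dict:
    """Sensitivity, specificity, and F-beta from confusion-matrix counts."""
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    b2 = beta ** 2
    # F-beta written in terms of counts:
    # (1 + b^2) * TP / ((1 + b^2) * TP + b^2 * FN + FP)
    denom = (1 + b2) * tp + b2 * fn + fp
    f_beta = (1 + b2) * tp / denom if denom else 0.0
    return {"sensitivity": sensitivity,
            "specificity": specificity,
            "f_beta": f_beta}


if __name__ == "__main__":
    # Hypothetical rule that flags almost all patients as high risk.
    flag_most = classification_metrics(tp=95, fp=820, tn=80, fn=5)
    # Hypothetical classifier that is more selective.
    selective = classification_metrics(tp=80, fp=300, tn=600, fn=20)
    for name, m in [("flag most", flag_most), ("selective", selective)]:
        print(name, {k: round(v, 3) for k, v in m.items()})
```

Reporting specificity alongside sensitivity and F2 is what exposes a score that achieves its recall by labeling almost everyone as high risk.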
Traditional tools such as HAS-BLED or HEMORR₂HAGES showed poor predictability in patients with cancer [16, 17, 54]. Our ML algorithms also failed to obtain high performance metrics in predicting major bleeding. Such poor performance suggests complex interactions between patients’ characteristics and outcomes in the presence of cancer. First, although we included additional cancer characteristics compared with traditional risk scores, performance did not improve [55]. This may suggest that our models failed to capture important features for predicting major bleeding. In fact, genetic factors and disease severity were not available in SEER-Medicare data, and dynamic features (i.e., cancer progression, new diagnoses of diseases) were not included in the models because of their complexity. Similar to previous risk scores, we found that bleeding history was an important factor in predicting subsequent major bleeding [17, 55]. Second, we excluded patients who had already initiated OACs before AFib diagnosis and those who initiated OACs during follow-up because OACs may increase the risk of bleeding. As a result, only 1.2% of patients in our cohort experienced bleeding events during follow-up, which created a severe class imbalance problem for our ML algorithms and may have led to poor predictability [56]. Future studies may expand the outcomes to other types of bleeding (i.e., intracranial bleeding, gastrointestinal bleeding, or other non-critical site bleeding) to improve the performance and clinical utility of the algorithms.
In our study, the SMOTE resampling approach did not improve model performance. In the training set, SMOTE created new synthetic ‘stroke’ individuals by interpolating between the original, real ‘stroke’ cases [48]. Studies have shown that SMOTE-like methods can improve the performance of weak classifiers such as SVM and decision trees [57]. In our study, SMOTE improved AUCs for SVM only. Another limitation of SMOTE is that it produced poorly calibrated models in which the predicted probability of the minority class (stroke) was strongly inflated, as demonstrated by the Brier score.
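The interpolation idea behind SMOTE and the Brier score used to assess calibration can be sketched in a few lines of plain Python. This is a simplified illustration and not the implementation used in the study. The function names, the toy feature vectors, and the probabilities are all hypothetical.

```python
import random


def smote_like(minority, n_new, k=2, seed=0):
    """Create synthetic minority points by interpolating between a real
    minority point and one of its k nearest minority neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        # Nearest minority neighbours of x by squared Euclidean distance.
        neighbours = sorted(
            (m for m in minority if m is not x),
            key=lambda m: sum((a - b) ** 2 for a, b in zip(x, m)),
        )[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(x, nb)))
    return synthetic


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


if __name__ == "__main__":
    stroke_cases = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]  # toy feature vectors
    print(smote_like(stroke_cases, n_new=4))

    outcomes = [1, 0, 0, 0]
    print(brier_score(outcomes, [0.25] * 4))  # matches the event rate
    print(brier_score(outcomes, [0.50] * 4))  # inflated minority probability
```

In the toy example, predicting the observed event rate for every patient gives a lower (better) Brier score than an inflated probability, which is the pattern of miscalibration described above for models trained on oversampled data.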
Our study is subject to some limitations. We were unable to capture some important variables in the ML models (i.e., BMI, genetic factors, frailty, and health behaviors) because they are not available in SEER-Medicare. Socioeconomic factors such as household income and education level are available at the aggregate area level (Census tract) but not at the individual level. In addition, our algorithms did not incorporate the impact of some post-baseline predictors (i.e., treatment dosage, adherence, recent CHA₂DS₂-VASc and HAS-BLED scores, recent use of NSAIDs, and other time-varying variables such as interactions between OACs and antineoplastic agents) [58]. Our study is applicable to the study period 2011–2019. Since 2020, Covid-19 has worsened outcomes of patients with AFib or cancer, negatively impacted health services, and delayed and reduced cancer screening and diagnosis in the United States [59‐63]. Therefore, the model should be updated and validated with Covid-related factors during and after the pandemic. In addition, our ML algorithms could not further stratify the risk of stroke and major bleeding (i.e., low, moderate, high, or very high). Future studies may leverage advanced ML algorithms such as survival ML to predict the probability of adverse events after 1 year or over extended follow-up. Last, the generalizability of our ML models to other populations may be limited (i.e., commercially insured patients, anticoagulated patients, or patients with other cancer types).
Acknowledgements
This study used the linked SEER-Medicare database. The interpretation and reporting of these data are the sole responsibility of the authors. The authors acknowledge the efforts of the National Cancer Institute; Information Management Services (IMS), Inc.; and the Surveillance, Epidemiology, and End Results (SEER) Program tumor registries in the creation of the SEER-Medicare database. The collection of cancer incidence data used in this study was supported by the California Department of Public Health pursuant to California Health and Safety Code Section 103885; Centers for Disease Control and Prevention’s (CDC) National Program of Cancer Registries, under cooperative agreement 1NU58DP007156; the National Cancer Institute’s Surveillance, Epidemiology and End Results Program under contract HHSN261201800032I awarded to the University of California, San Francisco, contract HHSN261201800015I awarded to the University of Southern California, and contract HHSN261201800009I awarded to the Public Health Institute. The ideas and opinions expressed herein are those of the author(s) and do not necessarily reflect the opinions of the State of California, Department of Public Health, the National Cancer Institute, and the Centers for Disease Control and Prevention or their Contractors and Subcontractors.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.