Introduction
Stroke is one of the leading causes of long-term disability [1]. Most stroke patients suffer from upper limb hemiparesis that significantly impairs their functional abilities and quality of life [2]. To help patients restore function, healthcare professionals have to provide rehabilitation interventions that are effective for each patient based on predicted outcomes. Nevertheless, making accurate predictions remains a challenging task because of the heterogeneous characteristics and recovery patterns among stroke patients [3].
With recent advances in technology, new techniques have been developed to assist clinicians/therapists in predicting patient recovery. One promising technique is machine learning, which uses computerized algorithms to optimize prediction. Its advantages include the ability to process large volumes of data, detect complex interactions between multiple variables and easily incorporate new attributes/data into models [4]. These advantages make machine learning an ideal tool for processing complex healthcare informatics data to develop prediction models [5].
In stroke, machine learning techniques have been used to predict motor and functional recovery in acute/subacute stroke patients. For example, Lin et al. evaluated whether machine learning models could predict recovery of activities of daily living in acute stroke patients [6]. Other studies assessed whether machine learning models could predict motor and/or cognitive improvement in acute/subacute stroke patients [7–9]. The results of these studies were promising, with moderate to high accuracy; however, these studies primarily involved inpatient rehabilitation in acute/subacute stroke. Whether machine learning methods can predict the responses of stroke patients to outpatient rehabilitation interventions, such as contemporary task-oriented interventions at the chronic stage of stroke, remains unknown.
Contemporary task-oriented rehabilitation interventions, including constraint-induced movement therapy (CIMT), bilateral arm training (BAT), robot-assisted therapy (RT) and mirror therapy (MT), are commonly used to address motor dysfunction in chronic stroke patients [10]. Systematic reviews and meta-analyses showed that these contemporary interventions were effective in improving motor function in chronic stroke patients and should be considered in clinical application [11–14]. Machine learning may be a useful tool to predict motor function improvement after contemporary task-oriented interventions, which may help to identify the responders to these interventions and facilitate practical use.
The purpose of this study was to determine the accuracy and performance of machine learning in predicting clinically significant motor function improvement after contemporary task-oriented interventions in chronic stroke patients, and to identify important predictors for building machine learning prediction models.
Results
The feature selection procedure identified the three most important attributes: time since stroke (gain ratio = 0.25), baseline FIM scores (gain ratio = 0.24) and baseline FMA scores (gain ratio = 0.15). The gain ratio for each of the other 10 attributes was 0. As a result, time since stroke, baseline FIM scores and baseline FMA scores were used for developing the KNN and ANN models.
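To make the gain ratio criterion concrete, the following is a minimal sketch of how a gain ratio can be computed for a single discrete attribute: the information gain about the class label divided by the split information of the attribute. The function names and the toy values are illustrative assumptions, not the study's software or data, and a continuous attribute such as time since stroke would first need to be discretized.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def gain_ratio(feature_values, labels):
    """Gain ratio of one discrete attribute: information gain / split information."""
    total = len(labels)
    # Group the class labels by the value the attribute takes.
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Expected class entropy after splitting on the attribute.
    conditional = sum(len(g) / total * entropy(g) for g in groups.values())
    info_gain = entropy(labels) - conditional
    split_info = entropy(feature_values)
    # An attribute with a single value carries no split information.
    return info_gain / split_info if split_info > 0 else 0.0


# Toy, hypothetical example (not study data).
labels = ["high", "high", "low", "low"]
informative = ["a", "a", "b", "b"]    # separates the classes perfectly
uninformative = ["a", "b", "a", "b"]  # tells nothing about the class

print(gain_ratio(informative, labels))    # 1.0
print(gain_ratio(uninformative, labels))  # 0.0
```

Under this definition, a gain ratio of 0 means that splitting on the attribute does not reduce uncertainty about the class label at all.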
The accuracy of the KNN model with three attributes was 85.42%, precision (PPV) was 0.85, recall (sensitivity) was 0.85, specificity was 0.67, NPV was 0.8, the F1 score was 0.84, and the AUC-ROC was 0.89. The accuracy of the ANN model with three attributes was 81.25%, precision (PPV) was 0.8, recall (sensitivity) was 0.81, specificity was 0.49, NPV was 0.8, the F1 score was 0.8, and the AUC-ROC was 0.77. Table 2 summarizes the performance metrics of the KNN and ANN models. The performance of the KNN and ANN models with the three attributes was better than that of the models with all 13 attributes (Table 2). Table 3 shows the confusion matrix of the test samples for the KNN and ANN models.
Table 2
Model performance metrics of KNN and ANN models with the 3 and 13 attributes
Model | Accuracy (%) | Recall (sensitivity) | Specificity | Precision (PPV) | NPV | F1 score | AUC-ROC |
3 attributes | | | | | | | |
KNN | 85.42 | 0.85 | 0.67 | 0.85 | 0.8 | 0.85 | 0.89 |
ANN | 81.25 | 0.81 | 0.49 | 0.80 | 0.8 | 0.79 | 0.77 |
13 attributes | | | | | | | |
KNN | 60.42 | 0.6 | 0.37 | 0.62 | 0.36 | 0.61 | 0.48 |
ANN | 68.75 | 0.69 | 0.51 | 0.7 | 0.49 | 0.69 | 0.71 |
Table 3
Confusion matrix of the test samples (N = 48)
Model / actual class | Predicted: low responders | Predicted: high responders |
KNN | | |
Actual: low responders | 7 | 5 |
Actual: high responders | 2 | 34 |
ANN | | |
Actual: low responders | 4 | 8 |
Actual: high responders | 1 | 35 |
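The reported accuracy and specificity can be checked against the counts in Table 3. The sketch below rests on two assumptions that this section does not state explicitly: the first count in each row is the number predicted as low responders and the second the number predicted as high responders, and the reported specificity is a support-weighted average over both classes. Under those assumptions the counts reproduce the reported accuracy (85.42% and 81.25%) and specificity (0.67 and 0.49). The function names are illustrative.

```python
def class_rates(matrix, positive):
    """Recall and specificity of a 2x2 confusion matrix for one positive class.

    matrix[i][j] is the number of samples of actual class i predicted as class j.
    """
    negative = 1 - positive
    tp = matrix[positive][positive]
    fn = matrix[positive][negative]
    fp = matrix[negative][positive]
    tn = matrix[negative][negative]
    return {"recall": tp / (tp + fn), "specificity": tn / (tn + fp)}


def weighted_summary(matrix):
    """Accuracy plus support-weighted recall and specificity over both classes."""
    total = sum(sum(row) for row in matrix)
    summary = {"accuracy": (matrix[0][0] + matrix[1][1]) / total}
    for name in ("recall", "specificity"):
        summary[name] = sum(
            sum(matrix[c]) / total * class_rates(matrix, c)[name] for c in (0, 1)
        )
    return summary


# Counts from Table 3; index 0 = low responders, index 1 = high responders.
KNN = [[7, 5], [2, 34]]
ANN = [[4, 8], [1, 35]]

for name, matrix in (("KNN", KNN), ("ANN", ANN)):
    s = weighted_summary(matrix)
    print(f"{name}: accuracy {s['accuracy']:.2%}, specificity {s['specificity']:.2f}")
# KNN: accuracy 85.42%, specificity 0.67
# ANN: accuracy 81.25%, specificity 0.49
```

With a support-weighted average, recall equals accuracy by construction, which is consistent with the matching accuracy and recall values reported for both models.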
Discussion
Our results showed that machine learning algorithms can accurately predict motor function improvement in more than 80% of the participants. Based on the AUC-ROC, the KNN model had an 89% chance and the ANN model a 77% chance of distinguishing between high and low responders. Furthermore, we identified the three most important attributes: time since stroke, baseline FIM scores and baseline FMA scores. The combination of these three attributes yielded better prediction than all attributes together. The sensitivity, PPV, NPV and F1 scores of the KNN and ANN models were good; however, the specificity was relatively low in the ANN model. Overall, the KNN model had better prediction performance than the ANN model.
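The "chance to distinguish" reading of the AUC-ROC can be illustrated with a short sketch: the AUC equals the proportion of (high responder, low responder) pairs in which the model assigns the higher score to the high responder, with ties counted as one half. The function name and the scores below are hypothetical and are not taken from the study.

```python
def auc_pairwise(positive_scores, negative_scores):
    """AUC-ROC as the fraction of positive/negative pairs ranked correctly."""
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0   # positive case ranked above negative case
            elif p == n:
                wins += 0.5   # ties count as half a correct ranking
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical predicted probabilities of being a high responder.
high_responders = [0.9, 0.8, 0.4]
low_responders = [0.5, 0.3]

print(auc_pairwise(high_responders, low_responders))  # 5 of 6 pairs are ranked correctly
```

An AUC-ROC of 0.89 therefore means that, for a randomly chosen high responder and a randomly chosen low responder, the model scores the high responder higher in about 89% of such pairs.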
Consistent with the findings of previous studies, our study showed that machine learning methods are feasible and applicable for predicting recovery of stroke patients [6–9]. Furthermore, we extend the findings of previous studies by showing that machine learning methods could also accurately predict post-intervention improvements after common task-oriented interventions in individuals with chronic stroke. The prediction performance of our models was comparable to that reported in studies of acute/subacute stroke. For example, one previous study found a prediction accuracy of 83% using random forest models in acute stroke patients [9]. Two other studies found model discriminating ability between 77 and 89% using various types of machine learning methods (e.g. support vector machine and logistic regression) in acute stroke patients [6, 8]. Similarly, in the present study, we found prediction accuracies of 85% and 81% and discriminating abilities of 89% and 77% with the KNN and ANN models, respectively. Although the prediction performance was similar between our study and previous studies, predicting changes in chronic stroke patients could be a much more difficult task because changes during the chronic period are not as evident as those in the acute/subacute period of stroke. Our study demonstrated that machine learning approaches were still capable of predicting functional changes in chronic stroke.
The three most important attributes identified were time since stroke, baseline FIM scores and baseline FMA scores. Time since stroke indicates the remaining level of neural plasticity post stroke [53]. The remaining level of neural plasticity may affect how the brain re-organizes itself and the resulting neurophysiological processes, such as cortical excitability and interhemispheric inhibition during the task-oriented interventions, which in turn will impact motor function improvement [53, 54]. Baseline FIM scores indicate the initial functional ability of the participants. Studies have shown that individuals’ FIM scores at admission could predict improvements at discharge and long-term care requirements [55, 56]. Similarly, in the present study, we found that individuals’ FIM scores prior to the task-oriented training can determine post-intervention improvement. As a result, FIM may be a useful measure for predicting recovery in both acute and chronic stroke. Baseline FMA scores indicate the initial motor function of the paretic arm. Several prediction model studies have found that baseline motor function was associated with recovery after stroke [57–59]. A recent study also found that motor recovery could be predicted by the initial FMA scores in 5 different subgroups of stroke patients [60]. Furthermore, contemporary task-oriented interventions emphasize repetitive practice of paretic arm movements to restore motor function. It is thus reasonable to find that initial motor function is crucial for post-intervention improvements.
These three attributes represent the baseline characteristics and impairment levels of participants, which may be difficult to modify. However, they could serve as useful indicators that help clinicians to identify chronic stroke patients who may benefit the most from the contemporary rehabilitation interventions. Subsequently, these interventions can be provided to suitable patients in time. Based on our findings, we recommend that clinicians/therapists record the time since stroke and assess at least the baseline FIM and FMA scores before applying contemporary task-oriented interventions in chronic stroke patients. The information provided by these three attributes can inform clinicians/therapists of the recovery potential of a particular chronic stroke patient and whether he/she would have a better chance of benefiting from contemporary task-oriented interventions. Assessing and recording these three attributes, instead of all 13 attributes, may help to reduce the workload in clinical settings and improve the efficiency of clinical practice.
Our study demonstrated that the initial level of impairment (i.e., baseline FMA scores) could predict whether participants reached clinically significant improvements after contemporary rehabilitation interventions. This finding was consistent with the “Proportional recovery rule” identified in previous stroke prediction model studies [61–65]. The “Proportional recovery rule” is the idea that most stroke patients will recover approximately 70% to 80% of their potential, where the potential is the difference between the initial and the maximum FMA scores [61–65]. For example, Winters et al. found that about 70% of their study patients demonstrated a fixed proportional paretic arm recovery (i.e., 78%) from the acute to the chronic phase of stroke [63]. According to this model, the initial FMA scores play a critical role in predicting the recovery potential of stroke patients. However, the “Proportional recovery rule” has been criticized because of a mathematical coupling issue: the initial FMA scores are part of both the dependent variable (final FMA scores minus initial FMA scores) and the independent variable (maximum FMA scores minus initial FMA scores) in a regression model [66, 67]. In this study, we adopted machine learning methods rather than regression analyses to construct prediction models, and we found that initial FMA scores were also critical for predicting stroke recovery. Our results, along with those of others, indicate that the initial impairment level may need to be considered during stroke rehabilitation processes [61–65]. Future studies could adopt different types of machine learning algorithms, such as support vector machine, to examine whether the proportional recovery rule still holds true in different types of machine learning prediction models.
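The proportional recovery rule and its coupling issue, as described in the paragraph above, can be written compactly. The symbol β for the recovered proportion is introduced here only for illustration, and the stated range is the approximate one given in the text.

```latex
\[
\underbrace{\mathrm{FMA}_{\mathrm{final}} - \mathrm{FMA}_{\mathrm{initial}}}_{\text{observed recovery}}
\;\approx\;
\beta \,
\underbrace{\left( \mathrm{FMA}_{\max} - \mathrm{FMA}_{\mathrm{initial}} \right)}_{\text{potential recovery}},
\qquad \beta \in [0.7,\ 0.8].
\]
```

Because the initial FMA score enters both the left-hand side and the bracketed term on the right-hand side, a regression of observed recovery on potential recovery is exposed to mathematical coupling.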
In addition, similar to the “Proportional recovery rule” studies, our machine learning models also showed that there might be non-fitters of the “Proportional recovery rule” whose outcomes could not be accurately predicted based on the initial impairment level [61–65]. This could be the reason why the accuracy of our machine learning prediction models was around 80%. It may be possible that these non-fitters require more intensive training than the fitters to trigger proportional recovery and benefit from rehabilitation interventions [68, 69]. Future studies could adjust the intensity (e.g., duration and/or frequency) of contemporary rehabilitation interventions to examine whether this would impact prediction accuracy.
In addition to the three clinical variables identified in this study, studies have found that other types of predictors were also relevant for predicting stroke recovery in acute, subacute and chronic stroke patients. These predictors included kinematic variables (e.g., reaction time, movement speed and path ratio) and neurophysiological variables such as motor evoked potentials (MEP). For example, Stinear et al. found that the strength of shoulder abduction and finger extension in combination with MEP could predict patients’ motor recovery at 3 months post stroke [70]. Majeed et al. found that kinematic variables such as the speed ratio and the number of speed peaks contributed to the prediction of changes in FMA scores after a three-week intervention in chronic stroke patients [71]. Future studies could include the three clinical predictors identified in this study (i.e., time since stroke, baseline FMA and FIM scores) as well as kinematic and neurophysiological variables in the ML prediction models to determine whether including various types of variables would optimize prediction performance.
Given that no one algorithm works best for every problem, it is recommended to use multiple machine learning algorithms to examine data [4]. Following this recommendation, we adopted two common machine learning algorithms, KNN and ANN. Both algorithms can process linear and non-linear relationships within the data and are therefore suitable for building prediction models for complicated health informatics data [72]. We found that both models can predict the responses of over 80% of participants and have approximately an 80% chance or above of distinguishing between high and low responders, indicating that the KNN and ANN algorithms may be suitable tools for predicting post-intervention changes in chronic stroke patients. However, the overall performance of the KNN model was better than that of the ANN model. This result was consistent with the findings of two previous studies that examined the performance of KNN and ANN in classifying responses of brainwave/imaging data [73, 74]. Those studies also identified higher accuracy in the KNN than in the ANN models. In addition to the accuracy, the specificity was also lower in the ANN than in the KNN model, although other performance metrics (i.e., sensitivity, AUC-ROC, PPV and NPV) were comparable between the two models. This result was similar to the findings of one previous study that classified brain imaging data using logistic regression and ANN models [75]. In that study, the specificity was also low in the ANN model. Two potential reasons may explain why the prediction performance (i.e., accuracy and specificity) of the ANN model was weaker than that of the KNN model. First, the sample size may not have been optimal for constructing the ANN prediction model. Compared with KNN, ANN is a much more complex algorithm and usually requires a larger data set [72, 76]. It is possible that including more participants may improve the prediction performance of the ANN model. Second, the low specificity of the ANN model could be due to the smaller number of participants in the low responder class in the test data set [72, 76, 77]. It is plausible that increasing the number of patients in the low responder class may enhance the specificity of the ANN model. However, in the real world, it may be difficult to obtain a balanced dataset with equal numbers of patients in the low responder and high responder groups because only those interventions/treatments that have been demonstrated to be beneficial for most patients will be regularly performed. As a result, the number of low responders is often smaller than that of high responders in clinical settings. On the other hand, our results suggest that the KNN algorithm may already be a potentially useful tool for outcome prediction in chronic stroke patients. The high sensitivity with moderate specificity, as well as the good predictive value and discriminating ability, indicates that the KNN model could be considered for outcome prediction of stroke patients in future clinical application.
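For readers less familiar with KNN, the following is a minimal sketch of the classification rule: a new case receives the majority class among its k nearest training cases. It is not the study's implementation. The value k = 3, the Euclidean distance, the assumption that the three attributes have already been scaled to a common range, and all data values are hypothetical and carry no clinical meaning.

```python
import math
from collections import Counter


def knn_predict(train_x, train_y, query, k=3):
    """Classify `query` by majority vote among its k nearest training points."""
    # Pair each training point's Euclidean distance to the query with its label.
    distances = sorted(
        (math.dist(x, query), label) for x, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]


# Synthetic, pre-scaled values for three attributes (hypothetical, not study data).
train_x = [
    (0.10, 0.90, 0.80), (0.20, 0.80, 0.70), (0.15, 0.85, 0.90),
    (0.90, 0.30, 0.20), (0.80, 0.20, 0.30), (0.85, 0.25, 0.10),
]
train_y = ["high", "high", "high", "low", "low", "low"]

print(knn_predict(train_x, train_y, (0.20, 0.90, 0.80)))  # high
print(knn_predict(train_x, train_y, (0.90, 0.20, 0.20)))  # low
```

KNN has no training step beyond storing the cases, whereas an ANN must fit its weights to the data, which is one way to see why an ANN usually needs a larger data set.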
Study limitations
Six limitations should be considered. First, our outcome prediction focused on contemporary task-oriented interventions. Future studies could examine whether the features identified in this study generalize to other types of interventions. Second, we examined predictions of motor function. Future studies could explore whether machine learning can accurately predict improvements in other domains (e.g., quality of life). Third, our predictions were based on the changes immediately after the interventions. Future studies could explore whether machine learning methods can be used to predict retention in the follow-up period. This would help to identify patients who will have lasting improvements after task-oriented interventions. Fourth, there were fewer patients in the low responder than in the high responder group, which may have affected the prediction performance (i.e., specificity) of the ANN model, although other performance metrics, including accuracy, sensitivity, positive/negative predictive value and AUC-ROC, were sufficient in the ANN model. Future studies could include a larger sample of stroke patients with more low responders and examine whether the specificity of the ANN model would improve. Fifth, we used a binary classification method to construct the prediction models. Although the performance of our binary classification models was good, it is still possible that a multi-level classification method may increase prediction accuracy. Future studies could divide patients into three groups (i.e., low, medium and high responder groups) and determine whether a multi-level classification method would increase prediction accuracy in stroke patients. Sixth, we only examined the prediction performance of the KNN and ANN algorithms. Future studies could include other types of machine learning algorithms, such as decision tree or support vector machine, and compare their performance with that of the KNN and ANN algorithms. This would help to identify the optimal ML algorithm for predicting motor recovery in chronic stroke patients.