Background
It is widely recognized that pneumonia increases medical costs and imposes a particular burden because patients with pneumonia make heavy use of medical resources and services [
1]. Further, evidence reveals that pneumonia-related deaths are more prevalent when compared with pneumonia-unrelated deaths [
2]. It is therefore crucial to plan pneumonia prevention efforts in order to reduce the occurrence of pneumonia among at-risk patients.
Schizophrenia, a severe mental disorder that affects more than 21 million people worldwide [
3], can impose a similarly considerable burden on medical expenses [
4]. Schizophrenic patients are reported to be more likely than the general population to die early from preventable conditions such as cardiovascular disease, metabolic disease, and infections [
3]. Schizophrenia can be treated with medications and psychological support [
3]; evidence [
5,
6], however, reports that anti-psychotic medicine is effective but may lead to cases of pneumonia. Considering that people with schizophrenia are usually vulnerable and may face discrimination or violation of their basic human rights [
3], the question of how to prevent potentially fatal diseases such as pneumonia that often accompany the treatment of schizophrenia is a pressing issue that should not be neglected.
Many scholars [
5,
7,
8] have investigated hospital-acquired pneumonia using traditional statistical models and have advanced our knowledge of pneumonia. Later studies [
9‐
11] further used machine learning approaches to investigate pneumonia-related issues. Few of those studies, however, used machine learning techniques to predict the risk factors of pneumonia specifically among schizophrenic patients. Without deeper knowledge of how schizophrenic patients develop hospital-acquired pneumonia, comprehensive preventive strategies cannot be formulated to counter this serious threat. The purpose of this study is to build a predictive model for hospital-acquired pneumonia among schizophrenic patients using machine learning techniques. Because machine learning techniques can analyze data that are unsuitable for traditional statistical models, they can offer different perspectives, and the findings can support healthcare professionals’ clinical decision-making.
To date, various studies have analyzed risk factors of contracting pneumonia based on traditional statistical models which require strict assumptions. Their findings revealed that multiple risk factors can influence the occurrence of pneumonia among patients. For example, Mortensen, Coley, Singer, Marrie, Obrosky, Kapoor and Fine [
2] reported that leukopenia was one of the factors associated with pneumonia mortality. Manabe, Teramoto, Tamiya, Okochi and Hizawa [
8] concluded that sputum suctioning, deterioration of the swallowing function, dehydration, and dementia were all risk factors associated with aspiration pneumonia. Gupta, Boville, Blanton, Lukasiewicz, Wincek, Bai and Forbes [
7] found that mechanically ventilated patients have an increased risk of mortality. Among pneumonia-related studies of schizophrenic patients, anti-psychotic drugs, although efficacious for treating schizophrenia, have been found to cause unanticipated side effects. For example, several previous studies have found that anti-psychotic drugs can lead to the development of pneumonia [
5,
12]. Further, drug-drug interactions between anti-psychotic drugs and anxiolytic or anti-convulsive drugs may accelerate the occurrence of pneumonia [
13]. Evidence [
6,
14] even showed that community-acquired pneumonia was associated with anti-psychotic drug use in elderly patients. Women were more likely than men to have a recurrence of pneumonia. The mechanism by which anti-psychotics contribute to pneumonia remains unclear, but cardiopulmonary effects [
15], agranulocytosis [
16], and abnormal glucose regulation [
17] have been reported.
Moreover, Kuo, Yang, Liao, Chen, Lee, Shau, Chang, Tsai and Chen [
5] reported that although an increased risk of pneumonia was detected with the use of the available anti-psychotics, only clozapine was associated with a dose-dependent increase. Therefore, the use and titration of clozapine pose a greater threat to patients under long-term management of schizophrenia.
Recently, a number of studies have adopted machine learning techniques to predict various issues concerning pneumonia. For example, Cooper, Aliferis, Ambrosino, Aronis, Buchanan, Caruana, Fine, Glymour, Gordon, Hanusa, et al. [
9] applied eight machine learning methods to predict the mortality of inpatients with pneumonia. They found that neural networks, hierarchical mixtures of experts, and logistic regression attained the lowest error rates. Chapman, Fizman, Chapman and Haug [
18] adopted machine learning algorithms including expert rules, a Bayesian network, and a decision tree to identify the onset of pneumonia from thoracic X-ray reports. The three algorithms differed in sensitivity, specificity, and precision, but their performance was similar to that of physicians. Heckerling, Gerber, Tape and Wigton [
10] integrated neural networks and genetic algorithms for predicting community-acquired pneumonia, and found that the inclusion of genetic algorithms can help optimize neural network algorithms. Caruana, Lou, Gehrke, Koch, Sturm and Elhadad [
19] utilized high-performance generalized additive models with pairwise interactions to predict the probability of death due to pneumonia. The results revealed that their proposed algorithm outperformed other algorithms such as logistic regression, random forest, and LogitBoost. Kim, Diggans, Pankratz, Huang, Pagan, Sindy, Tom, Anderson, Choi, Lynch, et al. [
11] developed a machine learning model to classify usual interstitial pneumonia patients, and concluded that their model is feasible for predicting the occurrence of usual interstitial pneumonia.
A review of the literature reveals a clear gap in pneumonia-related studies. Although a great deal of previous research has focused on the risk factors for or outcomes of pneumonia [
2,
7,
8], little research using machine learning techniques has been carried out specifically on schizophrenic patients. Given the special characteristics of schizophrenia and its possible influences on patients’ health conditions, it is imperative to develop a predictive model for the risk factors associated with pneumonia in this population. Such a model can be based on machine learning techniques, which can analyze health data even when the data violate the assumptions required by traditional statistical models.
Discussion
With appropriate treatment, schizophrenia can be well-controlled via medications and psychological support [
3]. However, schizophrenic patients are susceptible to pneumonia due to anti-psychotic medications [
5,
12]. It is therefore particularly important to better understand the risk factors for pneumonia that accompany the treatment of schizophrenia. Although a number of studies have investigated risk factors for pneumonia in schizophrenia [
5,
6], few of those studies adopted machine learning techniques for prediction. Among the seven machine learning algorithms adopted in our study, RF and C5.0 exhibited higher prediction accuracy than the remaining algorithms.
Using information gain and gain ratio, we also identified and ranked the six most important risk factors: dosage, clozapine use, duration of medication, change of neutrophil count, change of leukocyte count, and drug-drug interaction. Among them, clozapine dosage, clozapine prescription, and prescription duration were the major factors influencing the occurrence of pneumonia. Knol, Van Marum, Jansen, Souverein, Schobben and Egberts [
6] found that pneumonia risk was highest during the first week after initiation of an anti-psychotic medication. In addition, use of an atypical anti-psychotic (i.e., clozapine) and titration of its dosage were also associated with a greater risk of pneumonia [
5]. Therefore, physicians should take care to set an appropriate dosage and duration of medication when prescribing anti-psychotic medications. Further, we also found that changes in leukocyte and neutrophil counts might be indicators of the development of pneumonia among schizophrenic patients. Clozapine is notorious for its dangerous adverse effects, for example, neutropenia and agranulocytosis [
35]. Because an immunocompromised status weakens the body's protection against bacterial invasion and spread, physicians should closely monitor leukocyte and neutrophil counts in hospitalized schizophrenic patients. Finally, drug-drug interaction was also confirmed as a risk factor for pneumonia when treating schizophrenic patients. Hematological adverse effects including neutropenia and agranulocytosis are enhanced when clozapine is prescribed simultaneously with anxiolytics, anti-convulsants, antimicrobials, proton pump inhibitors, and other gastro-intestinal agents [
36]. Physicians are therefore advised to pay close attention to potential drug-drug interactions whenever prescribing anti-psychotic drugs.
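As an illustration of the ranking criteria mentioned above, the following minimal Python sketch computes information gain and gain ratio for categorical predictors. It is not the study's implementation: the toy data and the names clozapine_use, interaction, and pneumonia are hypothetical, and continuous predictors such as dosage would first need to be discretized.

```python
import math
from collections import Counter


def entropy(values):
    """Shannon entropy (in bits) of a list of categorical values."""
    total = len(values)
    return -sum((n / total) * math.log2(n / total) for n in Counter(values).values())


def information_gain(feature_values, labels):
    """Reduction in label entropy obtained by splitting on a categorical feature."""
    total = len(labels)
    groups = {}
    for value, label in zip(feature_values, labels):
        groups.setdefault(value, []).append(label)
    # Weighted average entropy of the labels within each feature value.
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def gain_ratio(feature_values, labels):
    """Information gain normalised by the entropy of the split itself."""
    split_info = entropy(feature_values)
    if split_info == 0:
        return 0.0  # a feature with a single value carries no information
    return information_gain(feature_values, labels) / split_info


# Hypothetical toy records (not study data): one entry per patient.
clozapine_use = ["yes", "yes", "no", "no"]
interaction = ["yes", "no", "yes", "no"]
pneumonia = ["yes", "yes", "no", "no"]

features = {"clozapine_use": clozapine_use, "interaction": interaction}
ranking = sorted(features, key=lambda name: gain_ratio(features[name], pneumonia), reverse=True)
print(ranking)
```

In this toy example the first predictor separates the outcome perfectly and the second not at all, so the first is ranked higher. Gain ratio divides by the entropy of the split, which penalizes predictors with many distinct values.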
Although the outcome variables and performance metrics are not entirely consistent, it is still worthwhile to compare our study with prior pneumonia-related studies that adopted machine learning approaches. In their study to predict pneumonia patients’ readmission, Hilbert, Zasadil, Keyser and Peele [
23] adopted a decision tree learner and found that age and gender were important factors, which were not identified in our study. The AUCs of CART and C5.0 derived from our proposed models (0.880 and 0.993) are both higher than that of Hilbert, Zasadil, Keyser and Peele [
23] (0.658). By employing k-nearest neighbor method, Chen, Huang, Tan, Chang and Chang [
24] built a diagnostic model for abnormal lung sounds with an error rate of 6.8%. In our study, however, the k-nearest neighbor method showed the poorest performance on all of the performance metrics. Khajehali and Alizadeh [
25] adopted a boosted naïve Bayes learner for predicting pneumonia patients’ length of stay, and their model achieved a high prediction accuracy of 95.2%. Their model outperformed our model, which used non-boosted naïve Bayes and achieved a prediction accuracy of 67.5%. Boosting, one of the ensemble methods, is known to improve the performance of a predictive model [
32]. Caruana, Lou, Gehrke, Koch, Sturm and Elhadad [
19] adopted a random forest learner and logistic regression to predict pneumonia risk from 46 features. The resulting AUCs for random forest and logistic regression were 0.846 and 0.843, respectively. In our study, random forest, a widely recognized learner that performs well across a broad range of problems [
37], outperformed the study of Caruana, Lou, Gehrke, Koch, Sturm and Elhadad [
19], but the logistic regression in our model performed worse than that of Caruana, Lou, Gehrke, Koch, Sturm and Elhadad [
19]. Huang, Chen and Hsu [
26] constructed a clinical decision model via a support vector machine learner for predicting pneumonia readmission, and their model achieved 83.9% predictive accuracy with six predictors: age, gender, number of medications, length of admission, number of comorbidities, and total admission cost. Our proposed model utilizing a support vector machine learner provided a slightly higher predictive accuracy (87.1%) than that of Huang, Chen and Hsu [
26].
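Because several of the comparisons above rely on the AUC, a brief sketch of how that metric can be computed may be helpful. The AUC equals the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as one half. The function name and the scores and labels below are made up for illustration and are unrelated to the reported results.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    scores: predicted risk for each case; labels: 1 = pneumonia, 0 = no pneumonia.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative case")
    # Count a win when the positive outranks the negative, half a win for ties.
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted risks and true outcomes for four patients.
example_auc = auc([0.9, 0.8, 0.3, 0.1], [1, 0, 1, 0])
print(example_auc)
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect separation, which is why the metric can be compared across studies even when class proportions differ.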
Our study is one of the few, to our knowledge, that adopted a machine learning approach to predicting the risk factors for acquiring pneumonia specifically among schizophrenic patients. By using machine learning techniques, risk factors commonly neglected by traditional statistical models can be discovered. Further, by using seven different classifiers, the findings can be compared in order to select the best predictive model for helping to prevent pneumonia in schizophrenic patients. Several academic and practical implications can be drawn from our findings.
Our study adopted classification and regression tree, decision tree, k-nearest neighbors, naïve Bayes, random forest, support vector machine, and logistic regression and compared the performance of these classifiers. Random forest, decision tree, and support vector machine outperformed the remaining classifiers. The results demonstrate that machine learning techniques have high potential for predicting risk factors for acquiring pneumonia among hospitalized schizophrenic patients. Further, the plausible negative effects of class imbalance, a common situation in health data, seemed to be diminished in our model by employing the synthetic minority over-sampling technique. Future studies can use this technique to mitigate the class imbalance issue.
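The core idea of the synthetic minority over-sampling technique can be sketched as follows: each synthetic case is placed at a random position on the line segment between a minority-class case and one of its nearest minority-class neighbors. This is a simplified illustration under stated assumptions, not the implementation used in the study; the function name, its parameters, and the two-feature toy data are hypothetical.

```python
import random


def smote(minority, n_new, k=2, seed=0):
    """Create n_new synthetic minority cases by interpolating between neighbors.

    minority: list of numeric feature tuples (at least two distinct cases).
    k: number of nearest minority neighbors considered for each base case.
    """
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        others = minority[:i] + minority[i + 1:]
        # Nearest minority neighbors of the base case by squared Euclidean distance.
        neighbors = sorted(
            others,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)),
        )[:k]
        neighbor = rng.choice(neighbors)
        gap = rng.random()  # position along the segment from base to neighbor
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbor)))
    return synthetic


# Hypothetical minority-class cases described by two numeric features.
minority = [(1.0, 1.0), (2.0, 1.5), (3.0, 1.0), (2.0, 3.0)]
synthetic = smote(minority, n_new=5)
print(synthetic)
```

Because the new cases are interpolated rather than duplicated, the minority class is enlarged without repeating records exactly. Over-sampling of this kind is normally applied to the training data only, so that evaluation still reflects the original class distribution.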
The potential risk factors used by our model are based on clinical evidence, and our model further identified six crucial risk factors for predicting which schizophrenic patients may acquire pneumonia. The findings of our study can serve as an important reference for psychiatrists by reminding them to pay closer attention to these risk factors when diagnosing and treating schizophrenic patients. Going one step further, our predictive model may be integrated into a computerized physician order entry (CPOE) system. Psychiatrists could then receive a timely alert regarding the possible onset of pneumonia when they use the CPOE system to diagnose and treat schizophrenic in-patients. With information about potential pneumonia risk, early prevention and intervention procedures and decision plans can be formulated in combination with related clinical experience.
Several limitations of our study should be noted. First, because the analyzed cases were extracted from a small-scale hospital, the generalizability of our findings may be limited. Future studies should gather more cases from a wider variety of hospitals. Second, other potential risk factors that were unavailable from the review of medical records could be considered for use in future models.