Key findings
To our knowledge, this is the largest study investigating the usefulness of common severity scores in predicting mid-term mortality in patients with spontaneous intracerebral hemorrhage treated in ICUs. Of the commonly used intensive care severity scores, both the APACHE II- and SAPS II-based models showed good discrimination, whereas SOFA displayed only satisfactory discrimination. With regard to calibration, only the SAPS II-based model showed satisfactory calibration, whereas the other models showed poor calibration. In the post-hoc analyses, removing the age and GCS score components from the SAPS II and APACHE II scores markedly lowered their discriminative power. Thus, the predictive ability of SAPS II and APACHE II in ICH patients comes mainly from the strong predictive effect of age and the GCS score. This is supported by the study’s main finding: compared to a simple prognostic model including only age and GCS score, the more complex ICU scores were of no additional prognostic value. It is not surprising that SOFA did not match the predictive performance of APACHE II and SAPS II (or the simple age and GCS score model), as SOFA was originally intended as a descriptive measure of organ failure and not as a predictive measure. Thus, for ICH patients treated in the ICU, there is nothing to favor the use of the existing complex ICU scoring systems, as age and GCS alone adequately predict mortality. Furthermore, abstracting age and GCS score is much more time-efficient than abstracting the complex intensive care scoring systems.
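As an illustration of what discrimination measures here, the following minimal sketch computes the area under the ROC curve (AUC) as the probability that a randomly chosen non-survivor was assigned a higher predicted risk than a randomly chosen survivor, with ties counted as one half. It is not the analysis code of this study; the function name `auc` and the toy risk values are hypothetical and chosen only for illustration.

```python
def auc(risks, died):
    """Area under the ROC curve via pairwise comparison.

    risks: predicted risk (or raw score) per patient, higher = worse prognosis
    died:  1 if the patient died during follow-up, 0 otherwise
    """
    non_survivors = [r for r, d in zip(risks, died) if d == 1]
    survivors = [r for r, d in zip(risks, died) if d == 0]
    if not non_survivors or not survivors:
        raise ValueError("need at least one death and one survivor")

    wins = 0.0
    for r_dead in non_survivors:
        for r_alive in survivors:
            if r_dead > r_alive:
                wins += 1.0      # correctly ranked pair
            elif r_dead == r_alive:
                wins += 0.5      # tie counts as half
    return wins / (len(non_survivors) * len(survivors))


# Hypothetical toy data: predicted risks from some model and observed outcomes.
toy_risks = [0.9, 0.7, 0.6, 0.3, 0.2]
toy_died = [1, 1, 0, 1, 0]
print(f"AUC = {auc(toy_risks, toy_died):.3f}")
```

Applying the same function to the predictions of two models on the same patients gives a simple, unadjusted way to see whether a complex score ranks patients any better than a model based on age and GCS alone; a formal comparison would also need confidence intervals.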
Interestingly, adding pre-admission functional status to the reference model (including age and GCS) did not improve the prognostic performance. This is somewhat surprising, as a recent study showed pre-admission functional status to be a strong independent predictor of outcome in general ICU patients [28]. Our results might indicate that in ICH patients, the injury severity itself is more important in determining patient prognosis than pre-admission functional status. Yet, only 9% of included patients were dependent in daily functions prior to admission. Thus, the analysis of this variable is probably underpowered, which may explain why it did not add any predictive power. Furthermore, included patients who were dependent prior to admission probably represent a selected cohort that was considered to have a reasonable prognosis and was therefore admitted to the ICU, increasing the likelihood of a type II error. Thus, no firm conclusions regarding the association between pre-admission functional status and outcome can be drawn from our study.
Comparison with previous studies
Clinical studies concerning the common intensive care severity scores in outcome prediction after ICH are limited, especially with regard to mid- or long-term mortality prediction. The results of our study are consistent with previous studies. In a prospective study including 90 patients with acute stroke, Handshu et al. showed that the prognostic performance of GCS was almost equal to that of SAPS II in both 90-day (AUC 0.68 and 0.75, respectively) and 365-day mortality prediction (AUC 0.73 and 0.77, respectively) [13]. However, the study included both hemorrhagic (54%, n = 49) and ischemic stroke (46%, n = 41) patients and, thus, the results may be biased, as these are two very different patient populations.
In a retrospective single-center study including 75 patients, Huang KB et al. showed that APACHE II, SAPS II and ICH score predicted 30-day mortality well in patients with primary pontine hemorrhage (AUC for APACHE II 0.92, AUC for SAPS II 0.89, AUC for ICH score 0.84) [29]. Yet, similarly to our study, the discriminative power of the GCS score (AUC 0.88) did not differ substantially from that of these more complex scoring systems. Furthermore, as in our study, SAPS II displayed the best calibration (p = 0.682). Patients with primary brain stem hemorrhage are, however, a specific group of stroke patients, as their prognosis is significantly worse than that of other ICH patients. Additionally, in a large prospective study investigating the role of APACHE II in prediction of outcome after acute intracerebral hemorrhage, Huang Y et al. found the mortality prediction of APACHE II to correlate well with the observed outcome (r = 0.84, p < 0.001) [30]. The primary endpoint in that study was three-month mortality, whereas we used six-month mortality as the primary outcome.
In this study, SOFA showed significantly poorer performance compared to the other models. This can be explained by the nature of the score itself. First, SOFA is an organ dysfunction score, originally designed to describe the degree of organ dysfunction rather than to predict outcome in critically ill patients. Second, the score is built from the level of dysfunction of six organ systems (cardiovascular, respiratory, hepatic, renal, coagulation, central nervous system), aiming to describe the degree of multi-organ failure, which is common in sepsis, whereas ICH is more of a single-organ problem, although multi-organ failure may occur [31]. In a large retrospective study investigating causes of death after ICH, Zurasky et al. found that only 9% of the deaths were due to non-neurologic reasons, whereas a neurological condition was the cause of death in the overwhelming majority [32]. Also, SOFA does not consider patient age, which is a major prognostic factor in ICH patients.
Mortality in our sample, with a six-month mortality of 48%, is in line with previous studies. Huang KB et al. reported a 30-day mortality of 41% [29], whereas the three-month mortality was 40% in the study conducted by Huang Y et al. [30]. However, the mortality in the study conducted by Handshu et al. was substantially higher than in all others, with a 90-day mortality as high as 59% and a one-year mortality of 68% [13].
In summary, the discriminative performance of a simple prognostic model composed of only age and GCS was equivalent to that of the more complex intensive care severity scores in patients with spontaneous ICH treated in the ICU. Thus, with regard to discriminative power, the age and GCS score based model can replace the existing severity scores. Yet, all models showed relatively poor calibration in predicting six-month mortality. Thus, as the clinical utility of a predictive model is influenced by both its discrimination and calibration [33], additional studies are necessary to improve the quality of predictive models used for quality assurance and research in intensive care for patients with spontaneous ICH. Furthermore, future studies should also take into account radiological parameters of the ICH to improve prognostic accuracy.
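Calibration, unlike discrimination, asks whether predicted risks agree with observed death rates. One common check is a Hosmer-Lemeshow-type goodness-of-fit statistic; this section does not state which calibration test was used, so that choice is an assumption made here for illustration only. The sketch below sorts patients by predicted risk, splits them into equally sized groups, and sums the squared differences between observed and expected deaths and survivors. The function name `hosmer_lemeshow` and the toy values are hypothetical.

```python
def hosmer_lemeshow(probs, died, groups=10):
    """Hosmer-Lemeshow-type chi-square statistic (illustrative sketch).

    probs:  predicted probability of death per patient (0..1)
    died:   1 if the patient died, 0 otherwise
    groups: number of risk groups (10 gives the usual deciles of risk)
    """
    pairs = sorted(zip(probs, died))          # order patients by predicted risk
    n = len(pairs)
    chi2 = 0.0
    for g in range(groups):
        chunk = pairs[g * n // groups:(g + 1) * n // groups]
        if not chunk:
            continue                          # fewer patients than groups
        size = len(chunk)
        expected_deaths = sum(p for p, _ in chunk)
        observed_deaths = sum(d for _, d in chunk)
        expected_survivors = size - expected_deaths
        observed_survivors = size - observed_deaths
        if expected_deaths > 0:
            chi2 += (observed_deaths - expected_deaths) ** 2 / expected_deaths
        if expected_survivors > 0:
            chi2 += (observed_survivors - expected_survivors) ** 2 / expected_survivors
    return chi2


# Hypothetical toy data: a model that predicts 50% risk for everyone.
toy_probs = [0.5, 0.5, 0.5, 0.5]
toy_died = [1, 0, 1, 0]
print(hosmer_lemeshow(toy_probs, toy_died, groups=1))
```

A small statistic indicates that observed and expected counts agree. In practice the statistic is compared against a chi-square distribution (conventionally with the number of groups minus two degrees of freedom) to obtain a p-value, with a large p-value indicating no evidence of poor calibration.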
Strengths and limitations
The major strength of our study is its adequate power to detect an effect: our sample consists of 3218 patients and, to our knowledge, this is the largest study of this type published so far. Also, the majority of all ICUs within one country were involved, which improves generalizability. An additional strength of the study is the high quality of the database used [16]. There are, however, some limitations to this study that deserve attention. First, as the study is retrospective in nature, we are restricted to the data available in the database. The FICC database is not a specific neurological ICU database, and it does not include variables that may be of specific interest in ICH patients, such as radiological data or information regarding the use of anticoagulation medication. Thus, we were unable to obtain data on radiological parameters of the ICH, such as hematoma volume, intraventricular hemorrhage, and ICH location. Therefore, we were unable to study the performance of radiological scores, such as the ICH score, which has proved useful [34]. Second, as management practices and ICU admission criteria differ between healthcare systems, our findings may not be generalizable to all of them.