In this large study regarding mortality prediction in ALF patients with vvECMO support, we found that both the PRESERVE and ECMOnet scores were suboptimal for predicting mortality accurately in our population. Two novel risk models were developed showing improved predictive ability. The addition of parameters available the first day after ECMO initiation enhanced mortality prediction. The combination of predictors found to be associated with the best risk prediction in this study adds evidence to the hypothesis that high comorbidity and unresponsive respiratory failure are important determinants of mortality following vvECMO support.
The SOFA score was designed for and has become well-integrated in clinical practice as an easily available bedside tool to evaluate organ failure/dysfunction over time in ICU patients. As recently described, a high pre-ECMO SOFA score has been associated with a higher mortality risk [13].
The ECMOnet score is an additive score (range 0 to 10) based on baseline characteristics of 60 patients with severe ARDS due to suspected or confirmed H1N1-influenza virus infection [6]. In our study population, the ECMOnet score did not show better discrimination than the SOFA score. Comparison of original and recalibrated estimates revealed considerable differences between the study populations, but continued poor discrimination after recalibration indicated that the combination of predictors was not useful in our study population. The ECMOnet score has been externally validated twice in patient populations with a profile similar to that of the original study [6,14]. Recent results indicate that patients receiving ECMO due to influenza-A virus infection have a lower mortality than patients presenting with other causes of ALF [13]. In addition, the ECMOnet study excluded patients with pre-ECMO mechanical ventilation >7 days, leaving a patient population that may have a more favorable risk profile [4]. Although a more homogeneous study population improves performance and stability of a risk prediction model, this may also limit its general applicability.
The PRESERVE score was developed in a multicenter study comprising 140 patients with severe ARDS. The additive score (range 0 to 12) consists of eight predictors. To our knowledge, this study is the first to externally validate the score. Minimal differences in predictor distributions as well as similar overall mortality rates support the comparability of the original and validation study populations. There was a clear linear trend in the risk of mortality across increasing subgroups of PRESERVE score. However, the observed differences in the subgroup mortality rates and the reduction in PSEP may be related to overfitting from the original analysis [11]. Again, in our population the PRESERVE score did not show better discrimination than the SOFA score.
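Discrimination, as used above, describes how well a score ranks non-survivors above survivors. The sketch below assumes it is summarized by the c-statistic (area under the ROC curve), which may differ from the exact measure used in this study. The scores and outcomes are invented for illustration and are not study data; `c_statistic` is a hypothetical helper name.

```python
def c_statistic(scores, died):
    """Probability that a randomly chosen non-survivor has a higher score
    than a randomly chosen survivor; tied pairs count as 0.5."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    if not pos or not neg:
        raise ValueError("need at least one survivor and one non-survivor")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical additive risk scores and outcomes (1 = died), invented for illustration.
scores = [2, 3, 3, 5, 6, 7, 8, 9]
died = [0, 0, 1, 0, 1, 0, 1, 1]

auc = c_statistic(scores, died)

# The statistic depends only on rank order, so rescaling the score
# with a positive slope leaves it unchanged.
auc_rescaled = c_statistic([2 * s + 1 for s in scores], died)
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation. Because only the ordering matters, two scores can be compared on the same patients even when their point ranges differ.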
Novel risk models
Based on the limited usefulness of the established scores, we identified three strengths in our study which justify our attempt to develop a new risk model: First, the present study (n = 304) is the largest study investigating mortality prediction in ALF patients receiving vvECMO support. Second, the availability of a broad range of variables potentially related to ECMO outcome increases the chances of finding an improved combination of predictors for mortality following vvECMO. Third, the rigor of variable selection and the statistical methods applied have been designed to avoid overfitting the models to the study population [15-17].
The final models described in the present study represent combinations of variables that contain the most useful information to predict ECMO mortality. The models are not designed to reflect causal relationships between single predictors and outcome, which would only be possible if all factors that influence mortality were known. The included predictors may not play causal roles themselves, but carry information from one or several other predictors and thus represent markers for other causal relationships.
The composition of predictors in Models 1 and 2 underlines the importance of the patient’s underlying health condition and regenerative capacity for prediction of ECMO outcome. Advanced age and chronic immunosuppression, both associated with reduced functional reserves, high comorbidity and reduced ability to recover, have been consistently associated with increased mortality [7,9,18]. In accordance with previous findings, we found the necessity of high minute ventilation before ECMO to be an important predictor of mortality [9]. A high pre-ECMO serum lactate concentration could indicate tissue hypoxia with subsequent metabolic acidosis and severe organ injury. Parallel to reduced hematocrit described in the ECMOnet study, low pre-ECMO hemoglobin concentrations were associated with an increased mortality risk. Possible causes of pre-ECMO anemia include hospital-acquired anemia, iron deficiency anemia or chronic disease. Preoperative anemia has repeatedly been described as an independent risk factor for increased mortality in cardiac as well as noncardiac surgery [19,20]. The harmful effect of anemia is greater than the increased risk explained by the need for transfusion [20]. However, associated comorbid conditions may confound the role of anemia as a risk factor.
The improved performance with Model 2 indicates that ECMO relieves patients from acute respiratory and hemodynamic stress and provides time to recover already within the first 24 hours. In patients with severe hypoxemia or high cardiac output, where ECMO may not improve gas exchange sufficiently to decrease aggressiveness of mechanical ventilation, continued invasive ventilation with high volume and FiO2 may bring about a vicious cycle with ventilator-induced lung injury [4]. Thus, little reduction in, or a continued need for, high FiO2 and norepinephrine on day 1 may indicate patients with unresponsive respiratory failure despite ECMO.
Furthermore, day 1 plasma concentrations of fibrinogen and C-reactive protein (CRP) were identified as important predictors of ECMO outcome. Surprisingly, low CRP was associated with increased risk of mortality. This might be related to the ability to activate the immune system in order to defend against on-going infection and trigger recovery mechanisms. Whereas survivors generally had higher CRP before and one day after ECMO implantation, they had lower CRP concentrations post-ECMO, leading to the hypothesis that survivors were able to quickly activate an effective immune response. The lower CRP in non-survivors, on the other hand, might indicate immune exhaustion or liver failure, with a failing attempt to bring about an effective response in order to establish control of their acute illness.
The protective effect seen with higher fibrinogen concentrations may have several explanations. First, it may be another sign of an effective immune response, supported by the role of fibrinogen as an acute phase reactant. Given that both fibrinogen and CRP remained significant, they both provide information in addition to that from the other factor, even if the two may be somewhat correlated. Second, low fibrinogen levels are also associated with bleeding disorders, large-volume blood transfusions and disseminated intravascular coagulation, which are all associated with higher mortality. Fibrinogen may thus be an informative marker replacing the need for many individual markers and may contain information from several functions that affect outcome following ECMO support.
Altogether, the parameters from Model 2 enhanced discrimination between patients with high regenerative capacity and reversible disease and those with poor health condition, reduced functional reserves and intractable illness.
This study illustrates the difficulties in creating a robust model for predicting mortality following vvECMO. Model 1 showed improved goodness-of-fit and somewhat improved discrimination; however, the discriminative ability was not significantly enhanced from the PRESERVE score. ECMO patients have a large heterogeneity in disease and health conditions, and behind ‘mortality of all causes’ there is a large diversity in underlying causality. Patients have different diagnoses, time courses, ages, comorbidities, different sites and types of infection, and different degrees of physiological dysfunction [21]. Therefore, inclusion of day 1 parameters seemed a promising approach to better prediction. The external scores calculated on day 1 data did not provide adequate risk prediction, supporting the usefulness of our new model.
It would be of interest to develop a prediction model for use later during the course of ECMO, for example on day 7. Such a model might aid in the decision to wean the patient from ECMO. However, due to smaller patient numbers caused by earlier weaning or death, a preliminary day 7 model for the remaining 193 patients (126 survivors and 67 non-survivors) showed wide CIs for the odds ratios and was not significantly better at predicting mortality than the day 1 model (P = 0.25, data not shown). Thus, it seems that a substantially larger study would be needed. Furthermore, any model aimed at predicting withdrawal of futile therapy has to be extremely accurate in order not to lead to erroneous decisions. From our present experience with 350 patients, we have observed previously unthought-of capacities of injured lungs to recover.
In this study, different statistical approaches were explored and compared in order to derive the best and most robust prediction model. While previous models have based their variable selection on univariate analyses, we employed a method combining clinical experience and judgment with computer-based statistics. Nevertheless, the inhomogeneous study population makes overfitting a persisting challenge and the ability of a predictive model to estimate the prognosis of individual patients with high accuracy remains limited. The usefulness of a prediction model rather lies in the assistance it may provide for the discrimination between higher- and lower-risk patients in order to help identify potential candidates for ECMO support and give guidance for rational and ethical resource utilization [6,11]. We, therefore, categorized patients into three groups with significantly different probabilities of survival. It may be argued that our defined high-risk group was very small, but we kept this cut-off because of its potential clinical usefulness. Categorization of patients may also be helpful for future external and internal validations in order to compare institutional variations in indications and practice of ECMO. However, although the risk score may function as a useful supplementary tool for clinicians, thorough clinical evaluation on a case-to-case basis still remains the cornerstone of ECMO handling.
Study limitations
Conventional rescue therapies, such as prone positioning and use of neuromuscular blockade, were not registered in all patients in the UKR database. A part of our study population dates back to 2008 when the advantages of prone positioning were not universally accepted and neuromuscular blockade was only rarely used in Germany. Hence, the quality of conventional treatment, which may be associated with mortality irrespective of the other factors included in the model, could not be evaluated in this study. In the calculation of the PRESERVE scores, all patients were assigned to ‘no prone positioning pre-ECMO.’ This may contribute to overestimating mortality risk with PRESERVE in our patient population. Furthermore, the PRESERVE score used mortality six months post-ICU discharge as the end-point, while the present study only followed patients up to hospital discharge. Mid- and long-term outcomes are important in ARDS patients receiving vvECMO support. Unfortunately, in the past, long-term outcomes of our patients and health-related quality of life were not reevaluated, as many patients were retrieved from distant hospitals after our team implanted ECMO as a rescue procedure to allow transport to our center. We plan to carry out a follow-up study to evaluate the validity of the presented models in the prediction of mid- and long-term mortality.
Missing observations due to incomplete documentation for some laboratory and respiratory parameters reduced the effective sample size in the statistical analyses. Bias due to cohort selection or missing data in the presented analyses seems unlikely since the overall mortality rate was constant in all subsamples with complete information for the different parts of the study. Since validation through bootstrap methods is preferred over data splitting methods [17], we used the whole patient cohort for model development. However, external validation is a crucial step before these models can be applied in clinical practice.
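The idea behind bootstrap validation can be sketched as an optimism correction: the whole model-development procedure is repeated on resampled cohorts, and the average drop in performance when each resampled model is applied back to the original cohort is subtracted from the apparent performance. The sketch below is an illustration under stated assumptions, not the procedure used in this study. The "model" is a deliberately simple stand-in (choosing the single best of several candidate predictors), the data are simulated, and all function names are hypothetical.

```python
import random


def c_statistic(x, y):
    """Concordance between predictor x and binary outcome y (ties count 0.5)."""
    pos = [a for a, d in zip(x, y) if d]
    neg = [a for a, d in zip(x, y) if not d]
    if not pos or not neg:
        return 0.5  # undefined without both outcomes; treat as uninformative
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def fit(rows, y):
    """Toy model development: select the column (and direction) with the
    best apparent c-statistic. Stands in for a full selection procedure."""
    best = None
    for j in range(len(rows[0])):
        c = c_statistic([r[j] for r in rows], y)
        perf, sign = (c, 1) if c >= 0.5 else (1 - c, -1)
        if best is None or perf > best[0]:
            best = (perf, j, sign)
    return best[1], best[2]


def evaluate(model, rows, y):
    j, sign = model
    return c_statistic([sign * r[j] for r in rows], y)


def optimism_corrected(rows, y, n_boot=100, seed=1):
    rng = random.Random(seed)
    n = len(rows)
    apparent = evaluate(fit(rows, y), rows, y)
    diffs = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]  # resample with replacement
        b_rows = [rows[i] for i in idx]
        b_y = [y[i] for i in idx]
        model = fit(b_rows, b_y)  # repeat the whole development step
        diffs.append(evaluate(model, b_rows, b_y) - evaluate(model, rows, y))
    optimism = sum(diffs) / n_boot
    return apparent, optimism, apparent - optimism


# Simulated cohort: 60 patients, 5 candidate predictors, only the first informative.
data_rng = random.Random(0)
rows = [[data_rng.gauss(0, 1) for _ in range(5)] for _ in range(60)]
y = [1 if r[0] + data_rng.gauss(0, 1.5) > 0 else 0 for r in rows]

apparent, optimism, corrected = optimism_corrected(rows, y)
```

Unlike data splitting, every patient contributes to both development and validation, which matters when the effective sample size is already reduced by missing observations. This internal check does not replace validation in an independent cohort.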