Introduction
Sudden death (SD) and pump failure death (PFD) are the predominant modes of death in patients with heart failure (HF) and reduced ejection fraction (HFrEF) [1]. Quantifying an individual patient’s risk for mode-specific mortality can help with complex and difficult decisions about costly interventions, such as an implantable cardioverter defibrillator (ICD) or a left ventricular assist device, which are aimed at preventing specific causes of death [2].
One recent guideline suggests that risk calculators may be helpful in estimating an individual patient’s benefit/risk of an ICD implantation [3]. However, most existing risk models in patients with HF focus on predicting all-cause mortality [4–7]. Few models have been developed specifically for different modes of death, and those that exist have limitations. Statistically, several models are limited by having few events [8], most failed to take into account the prognostic influence of death from other causes [9], and crucially, none were externally validated [8–10], which is preferable before a model is considered for clinical practice. Clinically, many models were built in cohorts in which few patients received modern evidence-based medications [10, 11]. In particular, the Seattle Heart Failure Model (SHFM) [4], which was designed to predict all-cause death and has also been shown to predict SD and PFD with good performance, was developed before the widespread use of beta-blockers and mineralocorticoid receptor antagonists (MRAs) [12]. Very recently, based on the same population, the authors developed the Seattle Proportional Risk Model (SPRM) to predict the proportion of deaths due to SD rather than the absolute risk [13]. It is unclear whether these models still perform well when applied to a contemporary cohort, particularly because, as recently demonstrated, the risk of sudden death has declined in parallel with improvements in medical therapy [14].
Theoretically, SD and PFD are two types of death with distinct risk profiles, and it is of interest to understand the potential association between different prognostic variables and each mode of death, especially in a single cohort and accounting for the competing risk of death from other causes.
The aims of this study were to develop and validate prognostic models separately for SD and PFD in patients with HFrEF, to compare the prognostic profiles of these modes of death, and to validate the SHFM and SPRM using the contemporary cohorts from the Prospective Comparison of ARNI with ACEI to Determine Impact on Global Mortality and Morbidity in Heart Failure Trial (PARADIGM-HF) [15] and the Aliskiren Trial to Minimize Outcomes in Patients with Heart Failure (ATMOSPHERE) [16].
Methods
Study population
This study consisted of a derivation cohort of patients in PARADIGM-HF and a validation cohort in ATMOSPHERE. Patients with an ICD or cardiac resynchronization therapy with a defibrillator (CRT-D) were excluded because these devices selectively reduce the risk of one of the two modes of death of interest. The design and results of both studies have been published [15, 16].
Briefly, PARADIGM-HF compared LCZ696 (sacubitril/valsartan) with enalapril in 8399 patients with a left ventricular ejection fraction (LVEF) ≤ 40% (changed to ≤ 35% by amendment) and NYHA class II-IV HF, in addition to recommended treatment including an angiotensin-converting enzyme (ACE) inhibitor or angiotensin receptor blocker (ARB), a beta-blocker (unless contraindicated) and an MRA (if indicated). Patients were required to have a plasma B-type natriuretic peptide (BNP) ≥ 150 pg/ml (or N-terminal pro-BNP [NT-proBNP] ≥ 600 pg/ml), or a BNP ≥ 100 pg/ml (or NT-proBNP ≥ 400 pg/ml) and a HF hospitalization within the past 12 months. The key exclusion criteria included intolerance of ACE inhibitors or ARBs, a history of angioedema, symptomatic hypotension, a systolic blood pressure (SBP) < 100 mmHg at screening (< 95 mmHg at randomization), an estimated glomerular filtration rate (eGFR) < 30 ml/min/1.73 m², and a serum potassium level > 5.2 mmol/L at screening (> 5.4 mmol/L at randomization). Patients were enrolled from December 8, 2009, through November 23, 2012, at 1043 centers in 47 countries, and follow-up ended on March 31, 2014. The median follow-up was 27 months.
ATMOSPHERE compared aliskiren monotherapy and aliskiren/enalapril combination therapy with enalapril monotherapy in 7016 patients with NYHA class II-IV HF, an LVEF ≤ 35% and elevated plasma BNP levels (same criteria as in PARADIGM-HF). The main exclusion criteria were very similar to those of PARADIGM-HF, with more stringent requirements for renal function and serum potassium levels but a lower SBP threshold. Patients were required to be treated with a beta-blocker (unless contraindicated) and could be treated with an MRA if the investigator felt it was indicated. Patients were enrolled from March 13, 2009, to December 26, 2013, at 789 centers in 43 countries, and were followed up until July 31, 2015. The median follow-up was 36.6 months.
Both trials used a composite primary outcome of cardiovascular death or HF hospitalization. All patients provided written informed consent.
Outcomes
In each trial, all deaths were adjudicated by the same committee using pre-specified criteria, in a blinded fashion. The same definitions for modes of death were used. SD was defined as death occurring unexpectedly in an otherwise stable patient, further classified as death witnessed or patient last seen alive < 1 h previously, and death in a patient last seen alive ≥ 1 h and < 24 h previously. PFD was defined as death occurring in the context of clinically worsening symptoms/signs of HF without evidence of another cause of death, including death as a complication of the implantation of a ventricular assist device, cardiac transplant or other surgery primarily for refractory HF, and death after referral to hospice specifically for progressive HF.
Prediction variables
To identify predictors for each mode of death, a broad spectrum of baseline variables (N = 62) was assessed separately in PARADIGM-HF (Table 1). These variables included demographics, clinical variables, medical history, ECG parameters, and laboratory tests including NT-proBNP. In each trial, patient demographics and medical history were collected at baseline. Physical examination and blood pressure, pulse and anthropometric measurements were also performed, and this information was recorded in the electronic case report form (eCRF) by the investigators. A 12-lead ECG was performed at baseline, and the tracing was interpreted by a qualified physician and documented in the ECG section of the eCRF. All laboratory tests except potassium and eGFR were performed in a central laboratory according to a pre-specified laboratory manual covering specimen collection, shipment of samples and reporting of results. Potassium and eGFR were measured in a local laboratory, and eGFR was calculated using the Modification of Diet in Renal Disease (MDRD) equation. A full set of baseline variables was collected in most patients, and patients with missing values (< 2.5%) were excluded from these analyses. No difference was observed between the overall randomized population and the cohort with all baseline variables available.
Table 1
Baseline patient characteristics in the derivation and validation trials
Characteristic | PARADIGM-HF (derivation) | ATMOSPHERE (validation) | p value |
Age, years | 63.7 ± 11.6 | 63.1 ± 12.1 | 0.01 |
Male sex, no. (%) | 5492 (76.7) | 4565 (76.5) | 0.73 |
Race, no. (%) | | | < 0.001 |
White | 4480 (62.6) | 3659 (61.7) | |
Black | 344 (4.8) | 95 (1.6) | |
Asian | 1480 (20.7) | 1716 (28.9) | |
Other | 852 (11.9) | 460 (7.8) | |
Region, no. (%) | | | < 0.001 |
North America | 275 (3.8) | 81 (1.4) | |
Latin America | 1372 (19.2) | 1077 (18.0) | |
Western Europe | 1423 (19.9) | 1225 (20.5) | |
Central Europe | 2625 (36.7) | 1737 (29.1) | |
Asia or Pacific region | 1461 (20.4) | 1848 (31.0) | |
Body mass index | 28.0 ± 5.5 | 27.2 ± 5.3 | < 0.001 |
Blood pressure, mmHg | | | |
Systolic | 122.0 ± 15.4 | 124.4 ± 18.2 | < 0.001 |
Diastolic | 74.2 ± 10.0 | 77.6 ± 11.0 | < 0.001 |
Heart rate, beats/min | 72.9 ± 12.1 | 72.4 ± 12.7 | 0.008 |
LVEF, % | 29.9 ± 6.1 | 28.8 ± 5.5 | < 0.001 |
NYHA class, no. (%) | | | < 0.001 |
I | 347 (4.9) | 164 (2.7) | |
II | 4988 (69.8) | 4030 (67.5) | |
III | 1756 (24.6) | 1718 (28.8) | |
IV | 54 (0.8) | 56 (0.9) | |
Ischemic etiology, no. (%) | 4204 (58.7) | 3232 (54.2) | < 0.001 |
HF duration, no. (%) | | | < 0.001 |
Within 1 year | 2391 (33.4) | 2229 (37.4) | |
> 1–5 years | 2781 (38.9) | 2197 (36.8) | |
> 5 years | 1984 (27.7) | 1538 (25.8) | |
Medical history, no. (%) | | | |
Current smoking | 1008 (14.1) | 741 (12.4) | 0.005 |
Previous HF hospitalization | 4459 (62.3) | 3490 (58.5) | < 0.001 |
Myocardial infarction | 2919 (40.8) | 2228 (37.3) | < 0.001 |
Angina | 1944 (27.2) | 1415 (23.7) | < 0.001 |
Stable angina | 1547 (21.6) | 1108 (18.6) | < 0.001 |
Unstable angina | 768 (10.7) | 600 (10.1) | 0.21 |
CABG or PCI | 1951 (27.3) | 1475 (24.7) | 0.001 |
Hypertension | 5101 (71.3) | 3725 (62.4) | < 0.001 |
Diabetes | 2406 (33.6) | 1629 (27.3) | < 0.001 |
Atrial fibrillation | 2621 (36.6) | 2002 (33.5) | < 0.001 |
Stroke | 596 (8.3) | 419 (7.0) | 0.005 |
Cancer | 320 (4.5) | 186 (3.1) | < 0.001 |
Asthma | 249 (3.5) | 179 (3.0) | 0.12 |
COPD | 876 (12.2) | 625 (10.5) | 0.002 |
AAA | 82 (1.1) | 61 (1.0) | 0.50 |
PAD | 610 (8.5) | 461 (7.7) | 0.10 |
Medication, no. (%) | | | |
Digoxin | 2232 (31.2) | 1940 (32.5) | 0.11 |
Diuretics | 5709 (79.8) | 4713 (79.0) | 0.25 |
ACE inhibitors | 5527 (77.2) | 5968 (100.0) | < 0.001 |
ARBs | 1646 (23.0) | 85 (1.4) | < 0.001 |
Beta-blockers | 6610 (92.4) | 5423 (90.9) | 0.002 |
MRAs | 3969 (55.5) | 2109 (35.3) | < 0.001 |
Any antiplatelet agents | 3988 (55.7) | 3251 (54.5) | 0.15 |
Aspirin | 3653 (51.0) | 3021 (50.6) | 0.63 |
Anticoagulants | 2173 (30.4) | 1633 (27.4) | < 0.001 |
Statins | 3796 (53.0) | 2893 (48.5) | < 0.001 |
Pacemaker | 513 (7.2) | 358 (6.0) | 0.007 |
CRT-P | 136 (1.9) | 107 (1.8) | 0.65 |
12-lead ECG, no. (%) | | | |
QRS duration, msec | 114.3 ± 31.6 | 114.5 ± 31.6 | 0.78 |
Atrial fibrillation | 1866 (26.1) | 1434 (24.3) | 0.02 |
Atrial flutter | 66 (0.9) | 52 (0.9) | 0.80 |
Bundle branch block | 1965 (27.5) | 1659 (28.1) | 0.43 |
Left bundle branch block | 1440 (20.1) | 1245 (21.1) | 0.18 |
Right bundle branch block | 552 (7.7) | 441 (7.5) | 0.59 |
Q wave | 1247 (17.4) | 1108 (18.8) | 0.05 |
Left ventricular hypertrophy | 1423 (19.9) | 1189 (20.1) | 0.73 |
Paced rhythm | 421 (5.9) | 296 (5.0) | 0.03 |
Laboratory measurement | | | |
eGFR < 60 ml/min/1.73 m², no. (%) | 2462 (34.4) | 1467 (24.6) | < 0.001 |
eGFR, ml/min/1.73 m² | 68.8 ± 20.3 | 75.1 ± 24.7 | < 0.001 |
Creatinine, mg/dL | 1.10 ± 0.29 | 1.02 ± 0.27 | < 0.001 |
BUN, mmol/L | 7.2 ± 2.9 | 7.2 ± 2.9 | 0.09 |
Albumin, g/L | 42.8 ± 3.2 | 43.2 ± 3.6 | < 0.001 |
Hemoglobin, g/L | 139.4 ± 16.2 | 137.4 ± 16.6 | < 0.001 |
Potassium, mmol/L | 4.51 ± 0.48 | 4.46 ± 0.47 | < 0.001 |
Sodium, mmol/L | 141.5 ± 3.0 | 139.7 ± 3.3 | < 0.001 |
Chloride, mmol/L | 103.9 ± 3.4 | 103.8 ± 3.6 | 0.24 |
Calcium, mmol/L | 2.32 ± 0.11 | 2.33 ± 0.12 | 0.06 |
Total cholesterol, mmol/L | 4.59 ± 1.16 | 4.54 ± 1.20 | 0.005 |
HDL-C, mmol/L | 1.24 ± 0.37 | 1.24 ± 0.38 | 0.50 |
LDL-C, mmol/L | 2.58 ± 0.95 | 2.52 ± 1.00 | < 0.001 |
Triglyceride, mmol/L | 1.71 ± 1.18 | 1.75 ± 1.21 | 0.077 |
NT-proBNP, pg/ml (a) | 1640 (888–3342) | 1204 (630–2285) | < 0.001 |
Statistical analysis
The baseline characteristics by cohort were compared using Student’s t test or the Mann–Whitney U test, as appropriate, for continuous variables, and the chi-square test for categorical variables. For each mode of death, the event rate was calculated per 100 patient-years, and the cumulative incidences over time were plotted and compared by cohort using the Pepe–Mori test, which treated death from other causes as a competing risk.
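As an illustration of the two descriptive quantities above, the following is a minimal sketch in plain Python, not the code used in this study (which relied on R and Stata). It computes a crude event rate per 100 patient-years and a nonparametric (Aalen-Johansen type) cumulative incidence for one mode of death in the presence of competing deaths. The function names, the cause coding (0 = censored, 1 = SD, 2 = PFD) and the toy data are assumptions made for the example.

```python
def rate_per_100_patient_years(n_events, follow_up_years):
    """Crude event rate: events / total follow-up time, scaled to 100 patient-years."""
    return 100.0 * n_events / sum(follow_up_years)


def cumulative_incidence(times, causes, cause_of_interest):
    """Nonparametric cumulative incidence of one cause with competing risks.

    times  : follow-up time of each patient
    causes : 0 = censored, any other integer = code for the mode of death
    Returns a list of (time, cumulative incidence) at each distinct death time.
    """
    by_time = {}
    for t, c in zip(times, causes):
        by_time.setdefault(t, []).append(c)

    n_at_risk = len(times)
    surv = 1.0  # probability of being alive (any cause) just before time t
    cif = 0.0
    curve = []
    for t in sorted(by_time):
        at_t = by_time[t]
        d_all = sum(1 for c in at_t if c != 0)
        d_k = sum(1 for c in at_t if c == cause_of_interest)
        if d_all > 0:
            # increment = P(alive just before t) * cause-specific hazard at t
            cif += surv * d_k / n_at_risk
            surv *= 1.0 - d_all / n_at_risk
            curve.append((t, cif))
        n_at_risk -= len(at_t)
    return curve


# Toy data (hypothetical): six patients, follow-up in years
times = [1, 2, 2, 3, 4, 5]
causes = [1, 0, 2, 1, 0, 2]
sd_curve = cumulative_incidence(times, causes, cause_of_interest=1)
pfd_curve = cumulative_incidence(times, causes, cause_of_interest=2)
all_death_rate = rate_per_100_patient_years(4, times)
```

Unlike one minus a Kaplan-Meier estimate that censors competing deaths, the cause-specific curves computed this way add up to the overall probability of death, which is why they suit a comparison of SD and PFD.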
A univariable Fine and Gray sub-distribution hazards model was first fitted to assess the influence of each prediction variable on the cumulative incidence of each mode of death [17]. For each continuous variable, linearity was examined using the restricted cubic spline method. If the response appeared nonlinear, cut-off values or transformations were applied according to the spline curves and clinical relevance. For categorical variables, appropriate dummy variables were used. The validity of the proportional sub-distribution hazards assumption was examined using time-varying terms. For each variable, the statistical strength for predicting each mode of death was quantified by χ² values with one degree of freedom.
For each outcome, we used a multivariable Fine–Gray model with backward stepwise selection based on the Akaike information criterion (i.e., equivalent to p = 0.157 for a predictor with one degree of freedom), starting with a full model including all candidate variables. The predictor selection process was repeated in 200 bootstrap samples, each sampled with replacement from the original PARADIGM-HF dataset and of the same size as the original. To minimize the chance of including weak and uninformative predictors, which might lead to model overfitting and optimism, we included in the final model only variables that were retained in > 50% of all bootstrap datasets and were statistically significant. Since LVEF is an established prognostic factor for pump failure death, we included it in the final model regardless of these inclusion criteria. For each mode of death, the final model was refitted in 200 bootstrap samples to obtain the average predictor coefficients. These averaged coefficients were used to calculate the individual risk score, which is the sum of the products of each predictor value and its corresponding coefficient. Predicted cumulative incidences over time by quartile of risk score were plotted against the observed Aalen-Johansen estimators to assess model performance [18]. Model calibration was examined by comparing observed-predicted pairs of curves in each quartile over time. Model discrimination was examined by visually assessing the spread of each set of curves (the wider the better) and by calculating Harrell’s C and the C-index at 1, 2, and 3 years, adjusted for right censoring [19].
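The risk-score construction described above can be sketched as follows. This is an illustrative Python outline only: the predictor names and coefficient values are hypothetical and are not the coefficients of the models reported here, and the quartile assignment by rank is one simple convention among several.

```python
def average_coefficients(bootstrap_fits):
    """Average each predictor's coefficient over the bootstrap refits."""
    names = bootstrap_fits[0].keys()
    n = len(bootstrap_fits)
    return {name: sum(fit[name] for fit in bootstrap_fits) / n for name in names}


def risk_score(patient, coefficients):
    """Sum of (predictor value x averaged coefficient) over all predictors."""
    return sum(coefficients[name] * patient[name] for name in coefficients)


def quartile_groups(scores):
    """Assign each score to quartile 1 (lowest) to 4 (highest) by rank."""
    order = sorted(range(len(scores)), key=lambda i: scores[i])
    groups = [0] * len(scores)
    for rank, i in enumerate(order):
        groups[i] = 1 + (4 * rank) // len(scores)
    return groups


# Hypothetical coefficients from two bootstrap refits (the study used 200)
fits = [
    {"nyha_iii_iv": 0.4, "log_ntprobnp": 0.3},
    {"nyha_iii_iv": 0.6, "log_ntprobnp": 0.5},
]
coefs = average_coefficients(fits)
score = risk_score({"nyha_iii_iv": 1, "log_ntprobnp": 7.0}, coefs)
groups = quartile_groups([0.1, 0.9, 0.5, 0.3, 0.7, 0.2, 0.8, 0.4])
```

Patients grouped this way can then be compared on predicted versus observed cumulative incidence within each quartile, which is the calibration check described in the text.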
To correct for optimism, internal validation was undertaken using a bootstrap approach. In detail, the C statistic of the derived model was determined in each bootstrap sample from which it was generated and also in the original dataset; the difference between these two C statistics was calculated and then averaged over 200 samples to give an estimate of the optimism. The optimism-corrected estimate of the C statistic was then calculated as the naïve C statistic minus the estimated optimism. External validation was performed in the ATMOSPHERE cohort by fitting a univariable Fine–Gray regression on the risk score, calculated as the sum of the products of the average predictor coefficients from PARADIGM-HF and the corresponding predictor values in ATMOSPHERE. Model performance in validation was assessed using the same approach as described above.
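The optimism correction can be written compactly. The sketch below is a generic Python outline under stated assumptions: `fit` and `c_statistic` are hypothetical callbacks standing in for the model-fitting and C-statistic routines (they are not functions from any package used in this study), and `data` is simply a list of patient records.

```python
import random


def optimism_corrected_c(data, fit, c_statistic, n_boot=200, seed=0):
    """Bootstrap optimism correction of a C statistic.

    fit(sample)            -> fitted model (hypothetical callback)
    c_statistic(model, d)  -> C statistic of `model` evaluated on dataset d
    Returns (naive C, estimated optimism, optimism-corrected C).
    """
    rng = random.Random(seed)
    naive = c_statistic(fit(data), data)

    total = 0.0
    for _ in range(n_boot):
        # resample with replacement, same size as the original dataset
        boot = [rng.choice(data) for _ in data]
        model = fit(boot)
        # apparent performance in the bootstrap sample minus
        # performance of the same model in the original data
        total += c_statistic(model, boot) - c_statistic(model, data)

    optimism = total / n_boot
    return naive, optimism, naive - optimism
```

A model that has been overfitted tends to look better in the sample it was derived from than in the original data, so the averaged difference is typically positive and the corrected C statistic is lower than the naïve one.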
To determine whether the prediction variables had a different effect on each outcome, all predictors from both models were fitted into cause-specific Cox regression models using the Lunn–McNeil method [20].
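The Lunn–McNeil method is usually implemented by data augmentation: each patient record is duplicated once per mode of death, with a failure-type indicator and covariate-by-type interaction terms, and a single Cox model is then fitted to the stacked data. The Python sketch below shows only this data layout under that assumption; the column names are hypothetical and the Cox fit itself is not shown.

```python
def lunn_mcneil_augment(records, causes=("SD", "PFD"), covariates=("nyha",)):
    """Stack one row per patient per mode of death.

    records : list of dicts with keys "id", "time", "cause" and the covariates;
              "cause" is None for patients censored or dying of other causes.
    In each stacked row, status is 1 only if the patient died of that row's cause.
    delta is the failure-type indicator (0 for the first cause, 1 for the second).
    """
    rows = []
    for rec in records:
        for k, cause in enumerate(causes):
            row = {
                "id": rec["id"],
                "time": rec["time"],
                "status": 1 if rec["cause"] == cause else 0,
                "type": cause,
                "delta": k,
            }
            for name in covariates:
                row[name] = rec[name]
                # interaction term: its coefficient measures how the covariate
                # effect differs between the two modes of death
                row[name + "_x_delta"] = rec[name] * k
            rows.append(row)
    return rows


# Toy records (hypothetical)
patients = [
    {"id": 1, "time": 2.0, "cause": "SD", "nyha": 3},
    {"id": 2, "time": 5.0, "cause": None, "nyha": 2},
]
stacked = lunn_mcneil_augment(patients)
```

In the stacked model, a test of the interaction coefficient for a given covariate corresponds to a test of whether that covariate has a different effect on SD and PFD.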
To validate the SHFM in contemporary cohorts and to compare our models with the SHFM, an SHFM score was calculated for each patient in PARADIGM-HF and ATMOSPHERE, and the ability of the SHFM to discriminate between SD and PFD was assessed. We also validated the SPRM in both cohorts using logistic regression analysis and assessed its discrimination using the area under the receiver operating characteristic curve (ROC AUC), which is equivalent to Harrell’s C for a binary outcome.
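The equivalence between the ROC AUC and a concordance statistic for a binary outcome can be checked numerically. The following Python sketch computes the AUC in two ways: as the trapezoidal area under the ROC curve, and as the proportion of concordant (event, non-event) pairs with ties counted as one half. The scores and labels are invented for the example.

```python
def auc_pairwise(scores, labels):
    """Concordance: share of (event, non-event) pairs ranked correctly; ties count 0.5."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    concordant = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                concordant += 1.0
            elif p == n:
                concordant += 0.5
    return concordant / (len(pos) * len(neg))


def auc_trapezoid(scores, labels):
    """Area under the ROC curve by the trapezoidal rule over distinct thresholds."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    tp = 0
    area = 0.0
    for thr in sorted(set(scores), reverse=True):
        d_tp = sum(1 for s, y in zip(scores, labels) if s == thr and y == 1)
        d_fp = sum(1 for s, y in zip(scores, labels) if s == thr and y == 0)
        # trapezoid of width d_fp between heights tp and tp + d_tp (in counts)
        area += d_fp * (tp + d_tp / 2.0)
        tp += d_tp
    return area / (n_pos * n_neg)


# Toy example (hypothetical): 1 = sudden death, 0 = other death
scores = [0.9, 0.8, 0.8, 0.4, 0.3, 0.2]
labels = [1, 1, 0, 1, 0, 0]
auc_c = auc_pairwise(scores, labels)
auc_roc = auc_trapezoid(scores, labels)
```

Both functions return 7.5/9 for the toy data, which illustrates why the AUC of a logistic model can be read on the same scale as the C statistics reported for the time-to-event models, with the caveat that it ignores follow-up time and censoring.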
A two-tailed p < 0.05 was considered significant. The cumulative incidence function and C-index were computed using the ‘cmprsk’ and ‘pec’ packages in R (version 3.2.3). Other analyses were performed using Stata software (version 14.0 SE).
Discussion
We developed and validated separate prognostic models for SD and PFD in patients with HFrEF enrolled in PARADIGM-HF and ATMOSPHERE, the two largest and most contemporary trials in HF, using a competing risk analysis approach. Both models showed good discrimination and calibration and remained robust in the external validation.
The potential value of estimating the risk for mode-specific death, and in particular SD, in individual HF patients, has recently been reinforced by the results of the Danish Study to Assess the Efficacy of ICDs in Patients with Non-ischemic Systolic Heart Failure on Mortality (DANISH) [21]. In DANISH, ICD treatment did not reduce overall mortality in patients at low risk of SD as a result of excellent contemporary therapy [21]. Older individuals, with more co-morbidity, were least likely to benefit, probably because they had a higher competing risk of PFD, and non-cardiovascular causes of death, both of which would not be reduced by an ICD [21, 22]. This trial raises the question of whether ICD implantation might be better targeted to individuals at highest risk of SD [21, 23]. As the contemporary risk of sudden death declines, concern has been expressed as to whether the benefits of ICDs outweigh the risks of these devices when applied in a relatively un-targeted way [14, 24]. For example, in a recent nationwide analysis of complications after primary prevention ICD implantation in ambulatory patients in the USA, the device-related mortality rate was reported to be 0.73% at 30 days, with a total serious complication rate of 8.4% [25]; similar data have been reported from other countries [26].
At least one recent guideline has suggested that validated risk calculators/risk assessment tools may “aid in the estimation of each patient’s benefit/risk of an ICD implantation” [3]. Several models for predicting modes of death in HF already exist. However, because they all have limitations, none has gained widespread acceptance in current clinical practice. Older models were developed before the broad utilization of contemporary evidence-based medications, e.g. beta-blockers and MRAs [10, 11]. More recently, separate risk scores for SD and PFD were reported among HF patients with unspecified left ventricular function in the MUSIC study [8]. Although the models offered excellent discrimination, with c-indices of 0.77 for SD and 0.80 for PFD, they were based on a small number of events (90 SD and 123 PFD) and few candidate predictors. Models for SD and PFD were also developed in CORONA but only patients with an ischemic etiology were included in that trial [9]. Moreover, the CORONA model did not include routinely collected variables, such as serum chloride and albumin. Likewise, the HF-ACTION investigators only assessed the additional mode-specific death information gained from adding biomarkers, i.e. NT-proBNP, galectin-3 and soluble ST2, to a clinical model developed previously for all-cause death [6, 27]. No prediction model or risk score was provided and over 46% of patients in HF-ACTION had an ICD in situ. Importantly, none of the models mentioned accounted for competing risks from other deaths or were validated in an independent population.
Although the SHFM was reported to discriminate well for SD and, particularly, PFD, with performance comparable to the models developed in this study [4, 12], its discrimination declined substantially when it was applied to PARADIGM-HF and ATMOSPHERE, indicating a significant loss of power to predict mode-specific death in a contemporary population receiving evidence-based medications according to current guidelines. Moreover, the predictive variables in the SHFM mainly reflect overall survival and lack specificity for each mode of death.
The SPRM was recently developed to predict the proportion of mortality due to SD rather than the absolute risk of SD [13, 28, 29]. Using the predicted annual total mortality rate derived from the SHFM, the authors attempted to identify a subset of patients who would benefit most from an ICD, based on having a high risk of SD but a low risk of dying from other causes. However, when this bi-modal system was applied to each of our more contemporary cohorts, it yielded poor discrimination, assigning most patients to ICD implantation. This poor performance in identifying potential candidates for ICD implantation may reflect a difference in the underlying risk across the cohorts, particularly the proportion of sudden to overall death in the validation cohorts (< 40%) and the derivation cohort (48%) [13, 15, 16]. Thus, the intercept from the original SPRM may not be transportable, and its direct application may lead to the predicted proportional risk being systematically higher in validation cohorts. However, in patients with non-ischemic heart failure randomized in the DANISH trial, ICD use was associated with lower mortality among patients with both an SPRM and an SHFM score above the median, i.e. these scores may be better at predicting response to ICD therapy than at identifying patients for implantation, at least in patients with a particular etiology [30].
The models developed in our study have some unique features and, as a result, strengths. They are based on a large contemporary population with a substantial number of patients receiving modern evidence-based therapies. Additionally, we examined a broad spectrum of candidate variables which are currently assessed in clinical practice, many of which have been reported to predict SD [13, 30–32], including demographics, physical examination, medical history, treatment, ECG, routine biochemical tests and new biomarkers (such as NT-proBNP). Also, death from other causes was treated as a competing risk rather than non-informative censoring, which diminishes bias related to each individual mode of death [33]. More importantly, our models were validated with robust results in an independent cohort. Given the geographically and ethnically diverse cohorts included, our models should be generalizable to a broad range of contemporary patients with HFrEF.
Of special interest are the similarities and differences between predictive variables for each mode of death. Advanced NYHA class, lower SBP and elevated NT-proBNP levels were predictive of both modes of death (and there was a strong trend for ECG QRS duration). Three variables showed a similar directional association with each mode of death but a stronger relationship with one mode over the other: longer duration of HF (PFD), serum albumin (PFD) and chloride (PFD), all indicators of more advanced heart failure. Four variables had directionally opposite relationships with each mode of death. Ischemic etiology was independently associated with a higher risk of SD but with a lower risk of PFD (a similar trend was seen for ECG left ventricular hypertrophy). Higher creatinine was associated with a higher risk of PFD and history of cancer was associated with a lower risk of SD.
Although LVEF is a well-known predictor for SD, and is recommended as the key criterion for selecting ICD recipients [2, 31], we found it was neither independently associated with SD nor differentiated between SD and PFD in the present models. This may reflect the relatively narrow range of LVEF among patients enrolled in PARADIGM-HF and, possibly, the inclusion of NT-proBNP in our models. NT-proBNP level was somewhat more strongly associated with the risk of SD than PFD, although the difference was not statistically significant. This hypothesis might also explain the under-estimation of the rate of SD in the highest risk quartile in the validation model as NT-proBNP concentration was slightly lower in ATMOSPHERE than in PARADIGM-HF.
There are several limitations to the present analysis. First, our models were built and validated in clinical trials rather than in “real-world” cohorts; patients in trials tend to be healthier, have less co-morbidity and be more likely to receive evidence-based therapies. However, it is in patients similar to those in the present study that ICDs are most clearly indicated. Second, our SD model was less discriminative than the PFD model, as previously reported for other models of SD [8, 12]. Some variables reported to be predictive of SD and PFD were not measured in PARADIGM-HF, including echocardiographic parameters, ambulatory ECG findings [8], and other biomarkers [27]. Third, ICDs can change the mode of death in a given patient. Although patients with an ICD at baseline were excluded, we cannot rule out a potential confounding effect of ICDs implanted after randomization, even though there were few such cases (2.7%). Furthermore, even if mode-specific death is appropriately classified and predicted by the models with reasonable accuracy, this might not translate into prediction of the response to treatment, particularly because not all sudden deaths are electrical and preventable by an ICD (some may be due to other types of cardiovascular events). Lastly, we did not account for heart transplantation and ventricular assist device implantation during follow-up, although there were very few such procedures.