Main

Pulmonary embolism (PE) is one of the most common and feared complications in cancer patients, given its frequency and the suffering it entails (Sørensen et al, 2000). There is a 15–30% prevalence rate of PE in necropsy series for the most thrombogenic tumours, owing to interactions between the mechanisms of tumorigenesis, haemostatic activation, and other factors (Svendsen and Karwinski, 1989). The introduction of multidetector computed tomography (CT) has increased detection rates of incidental PE, present in 2–8% of the studies performed in cancer patients (Dentali et al, 2010). In some recent series, incidentally diagnosed PE accounted for approximately 50% of embolic events (Font et al, 2014, 2016). On the other end of the spectrum of severity, PE is also a common cause of fatal events in daily practice, as well as in trials with new targeted therapies (Ranpura et al, 2011; Den Exter et al, 2013).

Classifying prognosis in PE is important, inasmuch as episodes classified as low risk might be eligible for less intensive support (e.g., outpatient management or early discharge), thereby lowering costs and enhancing patient comfort without compromising safety. In contrast, subjects at higher risk should receive stepped-up care or monitoring (Streiff et al, 2013). Different studies have identified several prognostic factors for cancer-associated symptomatic PE, most decisive among them being the presence of metastasis, immobilisation, low weight, or altered vital signs (Kline et al, 2012; Den Exter et al, 2013; Font et al, 2014, 2016). Several prospective cohort studies have based selection of low-risk patients eligible for outpatient treatment on pragmatic clinical decision rules (CDR), such as the HESTIA study eligibility criteria, which are based prominently on altered vital signs and risk of bleeding (Siragusa et al, 2005; Zondag et al, 2011; Font et al, 2014; Weeda et al, 2016).

On the other hand, prognostic multivariate models have been created, such as the RIETE registry scale and POMPE-C score, that predict 30-day mortality probability following PE (Kline et al, 2012; Den Exter et al, 2013); although, at best, they are marginally superior to other classifications developed for PE in the general population (e.g., PESI or sPESI; Carmona-Bayonas et al, 2016). Nevertheless, using any of them to assist in decision-making involves problems, not least of which is that their suitability for incidental PE has yet to be proven (Kline et al, 2012; Den Exter et al, 2013). Furthermore, they are not sensitive to competing risks, such as increased bleeding, responsible for some 10% of early mortality, or cancer progression, which accounts for 50% of 30-day mortality after a PE event (Den Exter et al, 2013; Carmona-Bayonas et al, 2016).

Consequently, there is no adequate prognostic stratification method for incidental and symptomatic PE. In this study, we have attempted to refine the classification of the entire spectrum of cancer-associated PE by combining an adaptation of the HESTIA criteria with other explanatory covariates and modelling a decision tree procedure.

Materials and methods

Patients

The source of information is an observational registry of consecutive patients with cancer-associated PE who received care at 14 Spanish hospitals between 2004 and 2015 (Registro de Embolia Pulmonar en Pacientes con Neoplasias, EPIPHANY registry for its Spanish acronym). This registry’s design, methods, and characteristics have been previously reported in depth (Carmona-Bayonas et al, 2016; Font et al, 2016; Plasencia-Martínez et al, 2016). Briefly put, the basic eligibility criteria required that patients be adults (≥18 years) with a PE diagnosis confirmed by means of objective imaging (CT angiography scans, high probability scintigraphy, or CT scheduled to assess tumour response or for other reasons). In order to choose a truly oncological population, subjects were withdrawn from the study if the PE had occurred more than 1 month prior to the diagnosis of cancer, or if more than 1 month had elapsed since completing adjuvant chemotherapy. Patients were also excluded if they had not received anticoagulant therapy without justification according to international clinical practice guidelines (Streiff et al, 2013).

Given that the study included a prospective observation component until closure, in case of multiple events, only one was considered to be the index PE, defined as the evaluable PE closest to the time of recruitment. The remaining PEs in the same patient were considered ‘previous history’ if they took place prior to the index PE, or ‘recurrence’, if subsequent to it. The registry was approved by the local Ethics Committees at each centre; informed consent was obtained from all living participants.

Study design

The main objective of this study was to develop a prognostic model, the EPIPHANY index, for cancer patients with either incidental or symptomatic PE. Given that it was a non-intervention database, the data reflect genuine clinical profiles and the decisions physicians make in line with their clinical practice. The data were collected from medical records or directly from the patients, together with clinicians with experience in cancer support treatment and radiologists who are subspecialised in diseases of the chest. All the investigators were trained in the study protocol requirements and the data were monitored in situ or by phone. The data were gathered by means of an electronic capture system, designed to refine inconsistencies and resolve data errors in real time. Data acquisition was not blinded. The minimum observation period was 3 months from the time PE was diagnosed, although longer follow-up was required whenever possible. The variables were collected during routine or unscheduled medical appointments.

Variables

The main outcome measure in this study was the occurrence of a serious medical condition between PE diagnosis on imaging and 15 days later. Serious complications are events that lead to serious clinical deterioration or death; for example, systolic blood pressure <90 mm Hg, acute respiratory failure, right-side heart failure, acute kidney failure, major bleeding, or any other event the investigator deems serious (Supplementary Table 1).

Other secondary end points were all-cause 30-day mortality, the cause of 30-day mortality, and 30-day venous or arterial rethrombosis. ‘Rethrombosis’ was defined as a second thrombotic event after appropriate PE treatment or progression of a previous venous thromboembolism (VTE) despite proper anticoagulant therapy. Rethrombosis was not considered to be a serious complication in the absence of the aforementioned criteria. An autopsic diagnosis notwithstanding, researchers attributed the cause of death on the basis of a clinical history review and findings from complementary testing. Death was deemed to be due exclusively to PE when there was a direct causal relationship through a concatenation of events associated with the pathophysiology of thrombosis. ‘Mixed’ deaths were defined by the presence of a temporal relationship between patient death and PE, although other intercurrent complications (e.g., infection or tumour progression) might plausibly play a relevant role. Death was considered unrelated to PE if there was no temporal relationship or clear concatenation of events. ‘Multiple’ was accepted as a response when there was a resumption of overlapping causes.

The potential explanatory covariates were selected after a bibliographic review and consultation with experts, taking into account their availability at the patient’s bedside. Data recording did not allow for missing data for outcomes, survival times, and basic demographic and clinical characteristics (vital signs, tumour status, performance status, etc.).

The ‘CDR variable’ was defined as an adaptation of the Hestia study eligibility criteria used in previous studies (Zondag et al, 2011; Weeda et al, 2016). The variable is positive in the presence of at least one of the following: systolic blood pressure <100 mm Hg, arterial oxygen saturation <90%, respiratory rate ≥30 breaths per minute, pulse ≥110 beats per minute, sudden or progressive dyspnoea, other serious complications constituting admission criteria in and of themselves, and clinically relevant bleeding, high risk of bleeding, or platelets <50 000 mm−3. The CDR was assessed immediately prior to the time of radiological diagnosis of PE.
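As a rough illustration, the CDR variable can be read as a single yes/no flag that is positive when any one criterion is met. The Python sketch below is not the registry's code: the field names are hypothetical, and the direction of the respiratory rate and pulse thresholds (at or above the cut-off) is an assumption.

```python
from dataclasses import dataclass


@dataclass
class Presentation:
    """Findings just before radiological diagnosis (hypothetical field names)."""
    systolic_bp_mmhg: float
    sao2_percent: float
    respiratory_rate: float        # breaths per minute
    pulse_bpm: float               # beats per minute
    platelets_per_mm3: float
    dyspnoea: bool = False                      # sudden or progressive
    other_serious_complication: bool = False    # an admission criterion in itself
    bleeding_or_high_bleeding_risk: bool = False


def cdr_positive(p: Presentation) -> bool:
    """True if at least one Hestia-like criterion is present."""
    return any([
        p.systolic_bp_mmhg < 100,
        p.sao2_percent < 90,
        p.respiratory_rate >= 30,   # assumed direction of the threshold
        p.pulse_bpm >= 110,         # assumed direction of the threshold
        p.dyspnoea,
        p.other_serious_complication,
        p.bleeding_or_high_bleeding_risk,
        p.platelets_per_mm3 < 50_000,
    ])
```

A patient with normal vital signs, no dyspnoea, and no bleeding risk would be CDR-negative under this reading; any single abnormal item makes the flag positive.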

Other explanatory covariates included: age, gender, tumour stage, type of cancer, use of targeted cancer therapies, tumour response at the time of PE based on radiological criteria, Response Evaluation Criteria in Solid Tumors (RECIST) 1.1 (Eisenhauer et al, 2009), Eastern Cooperative Oncology Group Performance Status scale (ECOG-PS; Oken et al, 1982), chronic obstructive pulmonary disease, prior cardiovascular disease, chronic kidney failure, concurrent deep vein thrombosis or a history of VTE, development of PE during treatment for a previous VTE, troponin levels (normal or high), creatinine clearance (normal or low), incidental or symptomatic diagnosis of the PE, presence of PE-specific symptoms, right ventricular diameter, additional radiological findings, Qanadli index (Qanadli et al, 2001), interventricular septal anomalies, presence of a single or multiple PE, oxygen saturation, blood pressure, heart and respiratory rates, previous tumour bleeding, prior use of antiplatelet agents, and major surgery in the previous month. Standardised definitions were used for each variable (Supplementary Table 1).

Development of a decision tree model

Classifications based on decision tree modelling seek to discover how the outcome variable is linked to the potential explanatory factors and, specifically, the configuration of these factors. This method is considered appropriate as the contribution of the explanatory covariates cannot be assumed to be necessarily additive or linear (Yohannes and Hoddinott, 1999; Lewis, 2000). The Exhaustive CHAID algorithm builds a decision tree by means of repeated partitions of each subset into two or more child nodes, beginning with the full database (Biggs et al, 1991). This methodology was used to determine the strength of association between the presence of serious complications within 15 days and the previously mentioned potential predictors. To determine the best split in each node, the categories of each predictor were merged into pairs until statistically significant differences were no longer observed within each component of the pair in comparison to the target variable. The predictors that produced the most significant partitions were then recursively chosen. Thus, the algorithm identified the main interactions and built subgroups defined by the different sets of independent variables. The level of significance for splitting nodes (αsplit) was 0.05. The Bonferroni method was used to adjust the value of significance.
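To make the splitting criterion concrete, the following Python sketch tests one candidate binary split with a Pearson chi-square statistic and applies a Bonferroni correction. It is a simplification under stated assumptions: Exhaustive CHAID as implemented in SPSS uses its own merging steps and Bonferroni multiplier, the function names are hypothetical, and the counts in the usage note are invented for illustration.

```python
import math


def chi2_2x2(a: int, b: int, c: int, d: int) -> float:
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    den = (a + b) * (c + d) * (a + c) * (b + d)
    if den == 0:
        raise ValueError("a row or column total is zero")
    return n * (a * d - b * c) ** 2 / den


def p_value_1df(chi2: float) -> float:
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(chi2 / 2.0))


def bonferroni(p: float, n_comparisons: int) -> float:
    """Plain Bonferroni adjustment: multiply by the number of comparisons, cap at 1."""
    return min(1.0, p * n_comparisons)


def split_is_significant(table, n_comparisons: int, alpha_split: float = 0.05) -> bool:
    """table = ((events, non_events) in category A, (events, non_events) in category B)."""
    (a, b), (c, d) = table
    return bonferroni(p_value_1df(chi2_2x2(a, b, c, d)), n_comparisons) < alpha_split
```

For example, with invented counts of 10 complications among 200 patients in one category and 60 among 200 in the other, `split_is_significant(((10, 190), (60, 140)), n_comparisons=39)` evaluates the split against the 0.05 threshold after adjusting for 39 candidate predictors.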

To cope with the overfitting and instability inherent to the decision tree, a 10-fold cross-validation procedure was applied. Thus, the data were randomly divided into 10 equal subsets. Trees were systematically built in 9 of those partitions (training subsets) and then tested in the remaining group (testing subset). Cross-validation produces a single, final model. ‘Risk’ is defined as the proportion of cases incorrectly classified by each of the individual trees; whereas the cross-validated risk estimate is the average of the risks of all the trees. This analysis was performed with the SPSS 23.0 software (SPSS Inc., Chicago, IL, USA). The bootstrap (1000 replications) optimism-corrected area under the receiver operating characteristic curve (ROC) was estimated using R software with the rms package (Harrell et al, 2015).
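The cross-validated risk estimate described above can be sketched in a few lines of Python. This is an illustration only: the "tree" is replaced by a toy one-split rule (majority outcome per level of a single predictor), the function names are hypothetical, and the fold assignment is not the one SPSS uses.

```python
import random
from collections import Counter, defaultdict


def k_fold_indices(n: int, k: int = 10, seed: int = 0):
    """Shuffle row indices and deal them into k nearly equal folds (assumes n >= k)."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_one_split(rows):
    """Toy stand-in for a tree: predict the majority outcome within each predictor level."""
    by_level = defaultdict(Counter)
    for x, y in rows:
        by_level[x][y] += 1
    overall = Counter(y for _, y in rows).most_common(1)[0][0]
    rule = {x: counts.most_common(1)[0][0] for x, counts in by_level.items()}
    return lambda x: rule.get(x, overall)


def cross_validated_risk(rows, k: int = 10, seed: int = 0) -> float:
    """Average, over the k folds, of the proportion of held-out cases misclassified."""
    risks = []
    for test_idx in k_fold_indices(len(rows), k, seed):
        held_out = set(test_idx)
        train = [r for i, r in enumerate(rows) if i not in held_out]
        predict = fit_one_split(train)
        errors = sum(predict(rows[i][0]) != rows[i][1] for i in test_idx)
        risks.append(errors / len(test_idx))
    return sum(risks) / len(risks)
```

Here `rows` is a list of `(predictor_level, outcome)` pairs; a real analysis would fit the full tree in each training partition instead of the toy rule.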

Finally, the percentage of serious complications was calculated in each of the terminal nodes of the tree and this was used to create the EPIPHANY rule, a simplified classification based on three categories: low, intermediate, and high risk. The predictive value of this classification was estimated by means of the odds ratio (OR) and its 95% confidence interval (95% CI) between the groups thereby generated and the appearance of the study end point. Cumulative hazards curves were calculated to establish the changes in hazard over time for each prognostic category.
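For readers who wish to reproduce this kind of estimate from a 2×2 table, a minimal Python sketch follows. The interval uses the Woolf (log-scale) method, which is an assumption: the text does not state how the confidence intervals were computed, and the function name is hypothetical.

```python
import math


def odds_ratio_ci(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Odds ratio and Woolf 95% CI.

    a, b: events and non-events in the group of interest
    c, d: events and non-events in the reference group
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Woolf interval")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper
```

The interval is symmetric on the log scale, so it is wider above the point estimate than below it when expressed as an odds ratio.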

Results

Patient characteristics

Patient characteristics are summarised in Table 1. Pulmonary embolism was incidentally diagnosed in 53% of the cases. Twenty-eight percent (n=302) of the episodes were treated at home. All patients received anticoagulation (initial therapy with low-molecular weight heparin in 92%). At the time of PE diagnosis, 73.6% of the patients had a metastatic tumour and 53.6% were receiving chemotherapy. The most common tumours were breast, lung, and colon cancer, accounting for 54.7% of the series. The recruitment process is illustrated in Figure 1.

Table 1 Baseline demographic and clinical characteristics
Figure 1 Study flow diagram. CT, computed tomography; PE, pulmonary embolism.

Outcomes

The main end point of this study, serious complications within 15 days, occurred in 208 patients (19.3%; 95% CI, 17.1–21.8%). The 15-day mortality rate was 10.1% (95% CI, 8.4–12.1) and of the 109 patients who died within that period, 45 (41%) did so as a result of tumour progression and not PE (Table 2). The rates of embolic recurrence and major bleeding were 4% and 2%, respectively.
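The interval for the main end point can be checked with a short Python calculation. The sketch assumes a denominator of 1075 patients (the n given in the heading of Table 3) and uses the Wilson score interval; the method used by the authors is not stated, so this is only one plausible choice, and it appears to agree with the reported 17.1–21.8% after rounding.

```python
import math


def wilson_interval(events: int, n: int, z: float = 1.96):
    """Wilson score confidence interval for a binomial proportion."""
    p = events / n
    denom = 1 + z * z / n
    centre = (p + z * z / (2 * n)) / denom
    half_width = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return centre - half_width, centre + half_width


# 208 serious complications among an assumed 1075 patients
lower, upper = wilson_interval(208, 1075)
print(f"{208 / 1075:.1%} (95% CI, {lower:.1%} to {upper:.1%})")
```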

Table 2 Main outcomes

Decision tree

Figure 2 shows the decision tree model with the 15-day serious complications data for each end node. The Exhaustive CHAID method selected six explanatory covariates from the initial 39: the Hestia-like CDR variable (any risk factor present vs none), ECOG-PS (<2 vs ≥2), oxygen saturation (<90 vs ≥90%), presence of PE-specific symptoms, previous tumour response evaluation (tumour progression, unknown, or not evaluated vs others), and prior surgical resection of primary tumour. While other additional nodes involving more variables could be generated, they did not provide any incremental risk discrimination.

Figure 2 EPIPHANY Index for the prediction of serious complications. The bars show the percentage of patients with no complications (light gray) or with complications (dark gray) within each node. The ‘Clinical Decision Rule’ variable encompasses the following characteristics: (1) systolic blood pressure <100 mm Hg, (2) arterial oxygen saturation <90%, (3) respiratory rate ≥30 breaths per minute, (4) pulse ≥110 beats per minute, (5) sudden or progressive dyspnoea, and (6) clinically relevant haemorrhage, high risk of bleeding, or platelets <50 000 mm−3. The patient is classified as low or high risk according to whether they exhibit none of these characteristics or at least one of them. CDR, clinical decision rule; ECOG-PS, Eastern Cooperative Oncology Group Performance Status scale; PE, pulmonary embolism; SaO2, arterial oxygen saturation.

The best predictor in the root node was the Hestia-like CDR variable; the episodes that did not meet any of these criteria had a lower risk of serious complications within 15 days, in comparison with episodes that satisfied at least one of them (4.7 vs 29.7%; OR 0.11, 95% CI, 0.07–0.18; P<0.0001) and 15-day mortality (2.5 vs 15.6%; OR 0.13; 95% CI, 0.07–0.25; P<0.0001), respectively. The decision tree makes it possible to elaborate on the prognostic stratification in seven terminal nodes. For purposes of practicality, they are summarised into three risk categories: high, intermediate, and low. Supplementary Table 2 outlines the demographic and clinical characteristics of each subgroup.

Low risk encompasses patients without any Hestia-like CDR criteria, and with controlled tumours or resected primary tumours, with a risk of serious complications of 1.4–3.4% and a 15-day mortality of 0.3%. Patients with any of the CDR risk factors were at high risk (complications rate, 20–55%), with the exception of the group consisting of patients with a good performance status and no PE-specific symptoms, who had an intermediate level of risk. The intermediate-risk group would also include all stable patients having uncontrolled or unevaluated tumours, and without surgery for the primary tumour, with a 10.6% risk of complications. The cross-validated risk estimate is 0.191 (s.e.=0.012); the optimism-corrected value of the area under the ROC curve is 0.779 (95% CI, 0.717–0.840).
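The three-category grouping described in this paragraph can be written as a short rule. The Python sketch below is one reading of the text only, not a reproduction of Figure 2: argument names are hypothetical, 'good performance status' is assumed to mean ECOG-PS <2, 'controlled tumour' is assumed to mean any response category other than progression, unknown, or not evaluated, and the oxygen saturation split is omitted because the text does not say how it maps onto the three categories.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    INTERMEDIATE = "intermediate"
    HIGH = "high"


def epiphany_category(
    cdr_positive: bool,
    ecog_ps: int,
    pe_specific_symptoms: bool,
    tumour_controlled: bool,
    primary_resected: bool,
) -> Risk:
    """Simplified three-level grouping as described in the text (assumptions above)."""
    if cdr_positive:
        # Any CDR criterion means high risk, except for patients with a good
        # performance status and no PE-specific symptoms.
        if ecog_ps < 2 and not pe_specific_symptoms:
            return Risk.INTERMEDIATE
        return Risk.HIGH
    # No CDR criteria: low risk if the tumour is controlled or the primary was resected.
    if tumour_controlled or primary_resected:
        return Risk.LOW
    return Risk.INTERMEDIATE
```

The nested structure mirrors the order of the splits in the text, with the CDR variable at the root; the published tree should be consulted for the exact terminal nodes.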

Outcomes according to risk groups are reported in Table 3. The risk of serious complications within 15 days increases with the group: 1.6, 9.4, and 30.6%; P<0.0001. The risk of 15-day mortality also rises progressively in patients of low, intermediate, and high risk: 0.3, 6.1, and 17.1%; P<0.0001. It is worth noting that high–intermediate-risk patients had an increased risk of complications, with an OR of 17.2 (95% CI, 7.7–40.3), P<0.0001, and of death, with an OR of 49.5 (95% CI, 6.8–356.9), P<0.0001.

Table 3 Outcomes in each risk group (n=1075)

Figure 3 illustrates the cumulative hazard function. Events are seen to be evenly distributed throughout the 15 days and do not cluster in the first hours following diagnosis of PE. The log-rank test reveals that the survival functions factored by prognostic categories are significantly different (P<0.0001).

Figure 3 Cumulative hazard functions for serious complications. In this figure, cumulative hazard curves were plotted to show the change in hazards over time (days), for each prognostic category.
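As an illustration of how a cumulative hazard curve of this kind can be obtained, the Python sketch below implements the Nelson–Aalen estimator. This is an assumption: the text does not name the estimator used, and the function name and input layout are hypothetical.

```python
from collections import Counter


def nelson_aalen(times, events):
    """Nelson-Aalen cumulative hazard.

    times:  follow-up time for each patient (e.g. days since PE diagnosis)
    events: 1 if the end point occurred at that time, 0 if censored
    Returns a list of (time, cumulative_hazard) at each distinct event time.
    """
    event_counts = Counter(t for t, e in zip(times, events) if e)
    exit_counts = Counter(times)      # everyone leaves the risk set at their own time
    at_risk = len(times)
    cumulative = 0.0
    curve = []
    for t in sorted(exit_counts):
        d = event_counts.get(t, 0)
        if d:
            cumulative += d / at_risk  # hazard increment: events / number at risk
            curve.append((t, cumulative))
        at_risk -= exit_counts[t]
    return curve
```

Running the function separately on the low-, intermediate-, and high-risk groups would give one step curve per prognostic category.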

Discussion

This study reports on the development of a decision tree model to stratify any cancer patient with PE according to the risk of serious complications within 15 days. Unlike other prognostic tools, the EPIPHANY index is applicable across the entire spectrum of PE severity, including both incidental and symptomatic events (Wicki et al, 2000; Aujesky et al, 2006; Uresandi et al, 2007; Jiménez et al, 2010; Kline et al, 2012; Den Exter et al, 2013). The model is a validation and extension of the CDR proposed in several clinical trials with the aim of pragmatically selecting low-risk patients eligible for outpatient care (Siragusa et al, 2005; Zondag et al, 2011; Font et al, 2014; Weeda et al, 2016). These decision-making rules are based on the combination of altered vital signs (e.g., hypotension, hypoxaemia, tachycardia, etc.) and factors that point toward a high risk of bleeding or other contraindications to receiving treatment in the home. Moreover, the EPIPHANY rule incorporates another five covariates that include discriminatory characteristics typical in cancer patients that are easily accessible at patients’ bedside, such as ECOG-PS, evaluation of tumour response prior to PE using RECIST 1.1 criteria, previous primary tumour resection, oxygen saturation, and the presence or absence of PE-specific symptoms. All these variables have been widely used in various contexts to predict clinical outcome and there is good reason to think that they are also important in PE (Wicki et al, 2000; Aujesky et al, 2006; Uresandi et al, 2007; Jiménez et al, 2010; Kline et al, 2012; Den Exter et al, 2013).

The ECOG-PS has been largely acknowledged in oncology to predict toxicity and adverse clinical outcomes in various contexts (Oken et al, 1982). In general, functional worsening points to severe underlying pathology, poorer physiological reserve, and decreased mobility, thereby making patients more prone to thrombotic risk (Jiménez et al, 2010; Den Exter et al, 2013). Patients having any risk factor and poor functional status have a worse prognosis than those with a good functional status, particularly when PE diagnosis is incidental. On the other hand, we have detected a new variable that should be incorporated into the CDR: evaluation of tumour response prior to PE based on RECIST radiological criteria, which determines short-term prognosis following PE. Tumours in progression or those at risk for progression because response could not be assessed are at higher risk for complications than those with controlled disease or with no evidence of disease, even in the absence of other prognostic factors. In fact, resection of the primary tumour appears to be the only thing that protects individuals with tumours in progression and no other risk factor facing a complicated clinical course. This variable likely improves prognosis as a consequence of decreasing local complications, such as serious bleeding (Lee et al, 2009). In fact, in this series, bleeding was located in the primary tumour in 43% of the cases, which rose to 50% in subjects who died due to haemorrhage.

Remarkably, some patients diagnosed incidentally displayed PE-specific symptoms upon meticulous anamnesis, as reported by other authors (O’Connell et al, 2006). Therefore, the absence of PE-specific symptoms does not correspond exactly with incidental PE.

The decision tree model classification method used after the Exhaustive CHAID procedure is one of the differences that distinguish the EPIPHANY index from other models (Wicki et al, 2000; Aujesky et al, 2006; Uresandi et al, 2007; Jiménez et al, 2010; Kline et al, 2012; Den Exter et al, 2013). This design was chosen given the interest in generating a classification that would reasonably imitate authentic decision-making. This means that, unlike a binary logistic regression, which postulates the existence of additive effects that contribute to explaining outcome, decision trees factor in the existence of strong interactions between variables and are better suited to elaborating decision-making algorithms that follow the same structure (Yohannes and Hoddinott, 1999; Lewis, 2000). Thus, in the real world, decisions in subjects with PE are not generally made on the basis of the small additive contributions of several variables, but on the presence or absence of strong dichotomous predictors such as cardiogenic shock, acute respiratory failure, hypotension, etc. (Wicki et al, 2000; Aujesky et al, 2006; Uresandi et al, 2007; Jiménez et al, 2010; Kline et al, 2012; Den Exter et al, 2013). The presence of a single one of these variables indicates high risk and is fundamental in the clinical decision to intensify therapy, regardless of the contribution of the remaining covariates of a logistic regression model. Decision trees are also useful in situations having non-linear probable effects for some variables, as is assumed in a sample of patients diagnosed using different methods (CT-angiography scans vs conventional CT) and having dissimilar clinical characteristics, depending on whether they are incidental or symptomatic events. Insofar as the previously developed scales are concerned (RIETE, POMPE-C, PESI, etc.) (Aujesky et al, 2006; Jiménez et al, 2010; Kline et al, 2012; Den Exter et al, 2013), we do not know for sure if they can complement these criteria, although a preliminary analysis performed by our group suggests that their use is not likely to be necessary after applying a clinical classification rule (Carmona-Bayonas et al, 2016).

Another striking difference between the EPIPHANY index and the afore-mentioned methods is that we propose beginning to use the probability of serious complications within 15 days as the primary end point and not all-cause 30-day mortality, which had been typically used in other studies (Aujesky et al, 2006; Jiménez et al, 2010; Kline et al, 2012; Den Exter et al, 2013). Of course we agree that mortality is a far more solid outcome; nonetheless, we believe that considering other end points in different clinical situations, as our group recently proposed (Carmona-Bayonas et al, 2016), is justified. One of the arguments is that the appearance of serious complications in individuals with PE treated as outpatients, far from medical supervision, can paradoxically turn low-risk patients into the most vulnerable, because of misclassification. In contrast, the probability of all-cause 30-day mortality will not necessarily affect decision-making regarding ambulatory treatment in some subgroups, as the cause of death is rarely the PE itself, and is often due to cancer progression (Den Exter et al, 2013; Carmona-Bayonas et al, 2016). In fact, patients on palliative care for advanced disease are those in whom it is even more important to prevent unnecessary hospitalisation at the end of their lives. The use of the all-cause 30-day mortality end point also entails the issue of proposing intensification of PE management (e.g., with fibrinolysis) in subjects at greater risk of early mortality due to cancer, who are precisely the ones who are less likely to benefit. For instance, when we examine the causes of death 15, 30, and 90 days after PE, cancer is the cause of death in 35%, 54%, and 65%, respectively. Although determining the cause of death in absence of autopsies has clear limitations, the same data appear in the RIETE registry, in which 50% of the deaths resulted from the cancer itself and not PE (Den Exter et al, 2013).

This study has certain limitations that must be taken into account. First of all, it is a fundamentally retrospective registry of medical history data, with the intrinsic limitations in precision this entails. Nevertheless, most of the events contemplated are solid and are faithfully recorded in the histories (blood pressure, oxygen saturation, documentation of tumour response, ECOG-PS, death, etc.). Second, PE is a highly polymorphic pathology and more external validations are needed by other groups, being cognizant that these models offer a general overview of the main risk factors. However, some subjects have other particular factors with a definitive impact on prognosis. Third, decision trees can be weak, unstable predictors in certain contexts. Thus, random forest models that incorporate the prediction of multiple, individual decision trees may perform better, although they are also more complicated to interpret and use in daily practice (Breiman, 2001). It is also doubtful that they can achieve a better definition of ‘low risk’. Finally, the assimilation and integration of radiological variables and/or biomarkers (e.g., troponin, pulmonary artery obstruction indices, right ventricular dilatation, etc.) would call for more in-depth studies.

In short, we have elaborated a decision tree to predict serious complications in cancer patients with PE that enables patients to be classified into groups of high, intermediate, and low risk for complications. This model validates and refines the classification rules previously used by other authors; it is based on variables that are easy to obtain; it is easy to use; and it can have potential implications for clinical management.