Introduction
In the Emergency Department (ED), about 40% of patients admitted to hospital [1] may be suspected of infection. For these patients, the decisions of immediate interest for the diagnostic work-up are whether a blood culture should be drawn and how many resources the microbiology lab should spend on providing a rapid answer. Rapid microbiology diagnostics decrease time to identification of pathogens and potentially enable earlier initiation of targeted antimicrobial therapy, improving antimicrobial stewardship programs [2]. For example, when a blood culture becomes positive, identification of pathogens by MALDI-TOF MS directly from positive blood cultures is now routine in many labs. Sub-species typing and detection of drug resistance determinants, besides microbial identification from isolated colonies, are also being explored for MALDI-TOF MS [3]. Another decision is whether blood cultures should be supplemented by a more expensive, but much faster, method based on direct-from-blood PCR (dfbPCR) [4]. Mangioni et al. proposed the use of multiparameter scores to triage patients for rapid diagnostic procedures [5], where scores which can predict the likelihood of a useful answer (probability of bacteraemia) and the need for a rapid result (high probability of mortality) may be useful. Ordering blood cultures without considering the pretest probability may be both wasteful and harmful [6].
Clinical scores which can predict the probability of bacteraemia, such as those described in two reviews [7, 8], can help make these decisions, including whether dfbPCR should supplement blood culture in some patients [9].
Several clinically validated scores have been used to predict mortality in ED patients, such as the National Early Warning Score (NEWS) [10] and the Mortality in Emergency Department Sepsis (MEDS) score [11]. The Systemic Inflammatory Response Syndrome (SIRS) [12] was established to define operational criteria for a sepsis diagnosis. SIRS has been replaced by the Sequential Organ Failure Assessment (SOFA) score or by the quick SOFA (qSOFA) score in the Sepsis-3 consensus definition of sepsis [13]. The Shapiro Decision Rule (SDR) predicts bacteraemia for ED patients [14], as does SepsisFinder (SF) [15].
The primary objective of this study is to retrospectively compare predictions of 30-day mortality and bacteraemia from all of these scores: SF, NEWS, SOFA, MEDS, qSOFA, SDR and SIRS. All scores will be assessed based on their Area Under the Receiver Operating Characteristic (AUROC) curves.
The review by Coburn et al. [7] focuses on overuse of blood culture in low-risk patients, which may be due to an overestimation of the probability of bacteraemia by physicians [16]. They conclude that both SIRS and SDR perform well in identifying a low-risk group which may not need blood culture. Pawlowicz et al. [17] found a 33.5% reduction in the number of ordered blood cultures after implementation of SDR. Another evaluation of SDR found that it was able to select a group of 45% of all patients that had a bacteraemia rate of only 0.9% [18].
In line with these studies, a secondary objective of this study will be to compare how well each of the scores can identify a low-risk group, consisting of about one third of the patients, where blood culture may be of limited value. In addition we will identify a high-risk group, consisting of 10% of the patients, where dfbPCR may be justifiable, despite its relatively high cost.
Methods
Patient data
The three test datasets will be referred to as HvH, SLB and TREAT04.
HvH
263 patients with suspected sepsis at Hvidovre Hospital, Hvidovre, Denmark; November 2011 to April 2012 [19].
SLB
199 patients with suspected sepsis at Lillebælt Hospital, Vejle, Denmark; July to August 2012 [20].
TREAT04
1354 patients admitted to a department of medicine with suspected community acquired infections at Rabin Medical Center, Petach Tikva, Israel. Data were collected in an interventional study of TREAT from May to November 2004 [21].
SF predictions
SF [15] is a CPN (Causal Probabilistic Net or Bayesian Net) model of part of the inflammatory response. It uses age, temperature, heart rate, calculated mean arterial pressure, mental status, neutrophil fraction, platelets, CRP, lactate, creatinine and albumin as input variables. The outputs from SF are 30-day mortality and the probability of bacteraemia. It is an inherent part of the CPN technology that SF tolerates missing values well. Input data for calculation of the SF prediction of bacteraemia will therefore be considered “complete” if any three out of the 11 possible input variables are available. Age is not used for the prediction of bacteraemia. The prediction of mortality uses the same input variables as the bacteraemia prediction, plus age as an independent factor [22]. The SF CPN was implemented in Hugin (version 8.7, Hugin Expert A/S), commercially available software for constructing and using CPNs. SF was trained on one dataset and tested on three independent datasets [15]. These three datasets (HvH, SLB and TREAT04) will also be used in this study in the comparison of performance between SF and the other clinical scores.
Clinical scores
Scores commonly used to aid diagnosis and/or prognosis in patients with suspected sepsis were included: NEWS, SOFA, MEDS, qSOFA, SDR and SIRS. The data items required for calculation of the clinical scores are given in Table 1. To best accommodate the data requirements of the different scores, data were mapped when required and possible.
Table 1
Data items used to calculate the clinical scores
Data item | SF | NEWS | SOFA | MEDS | qSOFA | SDR | SIRS |
Vital signs |
Systolic BP or MAP | x | x | x | | x | x | |
Heart rate | x | x | | | | | x |
Temperature | x | x | | | | x | x |
Chills | x | | | | | x | |
Mental status / GCS | x | x | | x | x | | |
Arterial O2 saturation (SaO2) | | x | | | | | |
Arterial O2 pressure (PaO2) | | | x | | | | |
Respiratory rate | | x | | | x | | x |
Respiratory distress | | | | x | | | |
Laboratory |
WBC | | | | | | x | x |
Platelets | x | | x | x | | x | |
Creatinine | x | | x | | | x | |
Neutrophil fraction | x | | | | | | |
Immature neutrophils | | | | x | | x | |
Bilirubin | | | x | | | | |
Albumin | x | | | | | | |
CRP | x | | | | | | |
Lactate | x | | | | | | |
Other/comorbidities |
Age | x | | | x | | x | |
Nursing home residence | | | | x | | | |
Lower respiratory infection | | | | x | | | |
Septic shock | | | | x | | | |
Terminal illness | | | | x | | | |
Suspected endocarditis | | | | | | x | |
Indwelling vascular catheter | | | | | | x | |
Vomiting | | | | | | x | |
Overall score availability (%) | 99.8 | 33.9 | 23.5 | 69.1 | 50.8 | 61.8 | 58.5 |
Calculated/mapped variables
Glasgow coma scale (GCS) was not available in the datasets. However, mental status was recorded as normal, confused or comatose. Normal was mapped to alert (GCS = 15); both confused and comatose were mapped to not alert (GCS < 15). PaO2 was less widely recorded than SaO2, so PaO2 was mapped from SaO2 to increase the availability of SOFA, which requires PaO2. Respiratory distress as used in MEDS was calculated if at least one of respiratory rate and SaO2 was present in the dataset.
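The mental status mapping and the availability rule for respiratory distress can be sketched as follows. This is a minimal illustration, not the study's code: the function names and the lowercase category strings are assumptions. The SaO2-to-PaO2 mapping and the respiratory distress thresholds are not specified in the text, so they are deliberately left out.

```python
from typing import Optional

# Recorded mental status categories (from the text) mapped to alert / not alert.
# True stands for alert (GCS = 15), False for not alert (GCS < 15).
ALERT_BY_STATUS = {"normal": True, "confused": False, "comatose": False}


def is_alert(mental_status: Optional[str]) -> Optional[bool]:
    """Return True/False for a recorded mental status, or None if it is
    missing or not one of the three recorded categories."""
    if mental_status is None:
        return None
    return ALERT_BY_STATUS.get(mental_status.strip().lower())


def respiratory_distress_computable(resp_rate: Optional[float],
                                    sao2: Optional[float]) -> bool:
    """MEDS respiratory distress is treated as computable when at least one
    of respiratory rate and SaO2 is present (hypothetical helper)."""
    return resp_rate is not None or sao2 is not None
```

Returning None for a missing mental status keeps "not recorded" distinct from "not alert", which matters for the completeness rules applied to each score.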
Adjustments to the scores
Other than the use of the mapped variables described, no adjustments to the scoring methods were made for NEWS, qSOFA and SIRS. Terminal illness and immature neutrophils were not recorded in any of the datasets. Therefore MEDS was calculated assuming these variables did not contribute to the scores in patients where they were missing. Adjustments were also made to the SOFA score: evidence of mechanical ventilation was not required for the respiratory component; the maximum score of the cardiovascular component was 1, due to lack of information on vasopressors; the maximum CNS score was 1, using alert/not alert because the GCS score was not available; and the renal component was calculated without use of urine output.
Completeness
For NEWS, MEDS, SOFA, qSOFA and SIRS the scores were only calculated if the data for the patients were complete. Data were considered complete where all of the variables used in the adjusted scores were present. SF was used with incomplete data, provided at least three of the 11 possible variables for SF were available.
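The two completeness rules can be expressed as a short check. This is a sketch under stated assumptions: the record layout (a dict with None for missing values), the variable names and the helper names are hypothetical. The qSOFA variable list follows the data items marked for qSOFA in Table 1.

```python
from typing import Iterable, Mapping, Optional

# The 11 SF input variables listed in the text (names are illustrative).
SF_INPUTS = (
    "age", "temperature", "heart_rate", "mean_arterial_pressure",
    "mental_status", "neutrophil_fraction", "platelets", "crp",
    "lactate", "creatinine", "albumin",
)
SF_MIN_AVAILABLE = 3  # "at least three of the 11 possible variables"

# Example of a conventional score: the data items used for qSOFA.
QSOFA_INPUTS = ("systolic_bp", "mental_status", "respiratory_rate")

Record = Mapping[str, Optional[object]]


def n_available(record: Record, variables: Iterable[str]) -> int:
    """Count the variables that are present (not None) in a patient record."""
    return sum(1 for name in variables if record.get(name) is not None)


def conventional_score_complete(record: Record, required: Iterable[str]) -> bool:
    """A conventional score is calculated only if every required variable is present."""
    required = tuple(required)
    return n_available(record, required) == len(required)


def sf_usable(record: Record) -> bool:
    """SF is applied when at least three of its 11 inputs are available."""
    return n_available(record, SF_INPUTS) >= SF_MIN_AVAILABLE
```

For a record with only temperature, heart rate and CRP, `sf_usable` returns True while `conventional_score_complete(record, QSOFA_INPUTS)` returns False, which is the asymmetry behind the differences in score availability.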
Microbiology
Bacteraemia was defined as positive blood cultures with one or more clinically significant pathogens. Bacillus spp. (except B. anthracis), coagulase-negative staphylococci (CoNS), Corynebacterium spp. and Micrococcus spp. were considered contaminants in the absence of other clinical evidence.
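The contaminant rule can be illustrated with a small classifier. This is a sketch only: it assumes isolates are reported as free-text organism names written out in full, and the function names and the `other_clinical_evidence` flag are hypothetical. Species-level names of coagulase-negative staphylococci (for example Staphylococcus epidermidis) and abbreviated genus names would need an explicit lookup that is not shown here.

```python
from typing import Iterable

# Name prefixes treated as contaminants, following the definition in the text.
CONTAMINANT_PREFIXES = (
    "coagulase-negative staphylococc",
    "corynebacterium",
    "micrococcus",
)


def is_contaminant(organism: str) -> bool:
    """Return True if the reported organism name falls in a contaminant group."""
    name = " ".join(organism.lower().split())  # normalise case and spacing
    if name == "cons":
        return True
    if name.startswith("bacillus"):
        # Bacillus spp. are contaminants, except B. anthracis.
        return not name.startswith("bacillus anthracis")
    return name.startswith(CONTAMINANT_PREFIXES)


def is_bacteraemia(isolates: Iterable[str],
                   other_clinical_evidence: bool = False) -> bool:
    """A culture counts as bacteraemia if any isolate is clinically significant.
    Contaminant groups count only when other clinical evidence is present."""
    return any(other_clinical_evidence or not is_contaminant(organism)
               for organism in isolates)
```

A culture growing only Micrococcus luteus is therefore classified as contaminated, whereas the same culture with an additional Escherichia coli isolate counts as bacteraemia.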
Outcomes and statistical analysis
The primary outcomes were bacteraemia and all-cause 30-day mortality. Predictive performance was assessed by AUROC. AUROCs were compared using the method of DeLong [23] as implemented in the pROC package of R (R version 3.5). To simulate possible clinical scenarios, two cut-offs were determined for each score that would result in a low-risk group of approximately one third of patients, and a high-risk group of approximately 10% of patients. Outcomes in each risk group were assumed to be binomially distributed. Confidence intervals for binomial proportions were calculated under the assumption of normality. Analyses were performed in R (version 3.5) and Python (version 3.7); visualizations were constructed using Matplotlib [24].
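The three building blocks of this analysis can be sketched in plain Python. This is an illustration, not the study's code: the function names are hypothetical, the AUROC is computed by its rank (Mann-Whitney) interpretation, the cut-offs are taken as empirical quantiles, and the confidence interval is read as the standard normal-approximation (Wald) interval, which is one plausible reading of "under the assumption of normality". The DeLong comparison itself is not reproduced here.

```python
import math
from typing import Sequence, Tuple


def auroc(scores: Sequence[float], outcomes: Sequence[int]) -> float:
    """Probability that a random case with the outcome scores higher than
    a random case without it; ties count as one half."""
    pos = [s for s, y in zip(scores, outcomes) if y]
    neg = [s for s, y in zip(scores, outcomes) if not y]
    if not pos or not neg:
        raise ValueError("both outcome classes are required")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def risk_cutoffs(scores: Sequence[float],
                 low_fraction: float = 1 / 3,
                 high_fraction: float = 0.10) -> Tuple[float, float]:
    """Return (low_cut, high_cut): patients with score <= low_cut form the
    low-risk group, patients with score >= high_cut the high-risk group.
    Tied scores make the group sizes approximate."""
    ranked = sorted(scores)
    n = len(ranked)
    k_low = max(1, round(low_fraction * n))
    k_high = max(1, round(high_fraction * n))
    return ranked[k_low - 1], ranked[n - k_high]


def wald_ci(events: int, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Normal-approximation confidence interval for a binomial proportion."""
    p = events / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    return max(0.0, p - half_width), min(1.0, p + half_width)
```

Because most of the clinical scores take integer values, many patients share the same score, so a quantile-based cut-off yields groups of only approximately one third and 10% of patients, consistent with the wording above.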
Discussion
For the combined dataset, SF obtained a mortality AUROC of 0.775, calculated from cases with complete data, which was higher than for NEWS (0.734) and SOFA (0.721) and significantly higher than for MEDS, qSOFA, SIRS and SDR.
For the combined dataset, SF obtained a bacteraemia AUROC of 0.745, higher than for SDR (0.743), SOFA (0.719) and NEWS (0.694) and significantly higher than for MEDS, qSOFA and SIRS.
SF could identify a low risk group, consisting of about one third of the patients. In that group the bacteraemia rate was 1.7% and the average price of obtaining one positive blood culture was quite high, € 1976.
SF could also identify a high-risk group, consisting of 10% of the patients. In that group the bacteraemia rate was 25.3%. The cost of obtaining one positive identification of a pathogen by dfbPCR was estimated at € 502, despite the relatively high cost of dfbPCR. Interestingly, this cost is substantially lower than the cost of obtaining a positive blood culture in the low-risk group.
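The relation between the positivity rate and the cost per positive result can be made explicit with a small calculation. This sketch assumes the simplest possible model, in which the cost per positive result equals the cost of one test divided by the fraction of tests that are positive. The cost model actually used in the study may include further components, and the unit costs are not stated in this text, so the back-calculated values below are illustrative only.

```python
def cost_per_positive(unit_cost: float, positive_rate: float) -> float:
    """Expected cost of obtaining one positive result when each test costs
    unit_cost and a fraction positive_rate of tests is positive."""
    if not 0 < positive_rate <= 1:
        raise ValueError("positive_rate must be in (0, 1]")
    return unit_cost / positive_rate


def implied_unit_cost(cost_per_pos: float, positive_rate: float) -> float:
    """Invert the simple model: unit cost implied by a reported cost per positive."""
    return cost_per_pos * positive_rate


# Unit costs implied by the reported figures under this simple model only
# (an assumption, not values reported in the text).
implied_blood_culture_cost = implied_unit_cost(1976, 0.017)  # low-risk group
implied_dfbpcr_cost = implied_unit_cost(502, 0.253)          # high-risk group
```

Under this model the cost per positive result scales inversely with the positivity rate, which is why a more expensive test applied to a group with a 25.3% positivity rate can cost less per positive result than a cheaper test applied to a group with a 1.7% positivity rate.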
The study was based on three datasets: HvH, SLB and TREAT04. These datasets have the strength that they are diverse. They were collected over almost a decade, in countries with high and low antimicrobial resistance, with a large variation in the amount and type of data collected and with substantial differences in mortality. This demonstrated the robustness of both SF and the clinical scores in the sense that they all showed uniform performance across these differences.
These differences also introduced some weaknesses: although the scores seem to be able to stratify the patients across the differences, it may prove necessary to adjust cut-off values to adapt to the dataset at hand. Another weakness of the datasets was that in many patients only some of the scores could be calculated. This weakened the data, which were already limited by the small size of the Danish datasets. It does, however, highlight the tolerance of SF to missing data, since SF could be applied to virtually all cases in the datasets.
The age of the data is also a weakness, since data on sepsis markers such as procalcitonin and CRP were either absent or scarce in the oldest of the datasets, TREAT04. CRP is one of the stronger sepsis markers in the dataset used to train the SF model and, although SF performs better than any single data item [15], it is to be expected that more CRP measurements would have improved the performance of SF. This may be even more true of procalcitonin.
In the literature, AUROCs were found for SOFA and qSOFA for in-hospital mortality in a large validation dataset: AUC = 0.74 (all) and 0.79 (non-ICU) for SOFA and 0.66 (all) and 0.81 (non-ICU) for qSOFA [27]. Similar results are observed in recent studies outside the ICU, with AUROC ranging from 0.77 to 0.83 for SOFA [28-32] and from 0.63 to 0.77 for qSOFA [29, 30, 33-35]. MEDS is also a predictor of mortality. It had an AUC of 0.82 and 0.76 for its derivation and validation cohorts, respectively [11], although significant variability has been seen in the literature, with AUC ranging from 0.67 to 0.77 in five recent studies [36-41]. NEWS also performs well as a predictor of mortality, with reported AUROC between 0.67 and 0.78 [30, 32, 35, 42].
Use of standard clinical scores as predictors of bacteraemia is not well reported in the literature. A review identified several validated models, including SDR, although it noted that very few scores for bacteraemia were prospectively validated and performed well, and none were in routine clinical use [8]. The other scores included in the analysis have not been evaluated specifically for prediction of bacteraemia outside of isolated studies. In one study, qSOFA showed some potential in a subgroup of elderly patients; however, the overall AUROC was 0.64 [43]. The same study reported an overall AUROC of 0.60 for SIRS.
The clinical applications discussed in the paper may deserve a health-economic evaluation. The cost estimates for the low-risk group indicate that blood culture in this group may not be cost-effective, in particular because testing of this group gave rise to 3.7% false positive blood cultures, which is higher than the 1.7% true positive blood cultures. As noted by Bates et al. [6], these false positives are presumably associated with substantially increased cost due to increased length of stay (4.5 days) and increased consumption of antibiotics (39%), and the true costs of contaminants may greatly exceed those of the test itself.
In contrast, dfbPCR from high-risk patients may be cost-effective in terms of a rapid diagnosis. Reallocation of resources currently spent on blood cultures from low-risk patients to dfbPCR from high-risk patients may be a cost-neutral way of improving the quality of microbiological services. However, a prospective randomized clinical outcome study is warranted before any risk assessment tool is routinely applied to eliminate a currently applied diagnostic intervention in any patient group, including the omission of blood culture in a patient population scoring low on sepsis risk.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.