Background
Cardiac output (CO) monitoring is a mainstay of hemodynamic management in high-risk patients having major surgery and in critically ill patients with circulatory shock [1, 2]. Numerous technologies are available to measure or estimate CO [3–6]. Thermodilution methods allow CO calculation based on the Stewart-Hamilton principle; after injection of a known amount of indicator, the change in indicator concentration downstream in the circulation is related to blood flow [7–9].
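The Stewart-Hamilton relation can be illustrated with a short numerical sketch: flow equals the injected indicator amount divided by the area under the downstream concentration-time curve. The function name, units, and sample curve below are hypothetical and chosen only for illustration. In clinical thermodilution the indicator is a temperature change, and monitors apply additional correction constants that are not modeled here.

```python
def stewart_hamilton_flow(indicator_amount, times_s, concentrations):
    """Estimate flow in L/min from an indicator dilution curve.

    indicator_amount: injected amount of indicator (e.g., mg)
    times_s: sampling times in seconds, strictly increasing
    concentrations: downstream concentration above baseline (e.g., mg/L)
    """
    if len(times_s) != len(concentrations) or len(times_s) < 2:
        raise ValueError("need at least two paired samples")

    # Area under the concentration-time curve (trapezoidal rule).
    area = 0.0
    for i in range(1, len(times_s)):
        dt = times_s[i] - times_s[i - 1]
        if dt <= 0:
            raise ValueError("times must be strictly increasing")
        area += 0.5 * (concentrations[i] + concentrations[i - 1]) * dt

    if area <= 0:
        raise ValueError("area under the curve must be positive")

    flow_l_per_s = indicator_amount / area  # amount / (amount * s / L) = L/s
    return flow_l_per_s * 60.0              # convert to L/min


# Hypothetical triangular dilution curve (not patient data).
example_flow = stewart_hamilton_flow(0.8, [0, 1, 2, 3, 4], [0, 2, 4, 2, 0])
print(round(example_flow, 2))
```

With this made-up curve the area is 8 mg·s/L, so 0.8 mg of indicator corresponds to a flow of about 6 L/min. A smaller area under the curve (faster washout) yields a higher estimated flow.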
Pulmonary artery thermodilution remains the clinical reference method for CO monitoring [10]. For intermittent pulmonary artery thermodilution, a fluid bolus with known volume and temperature is manually injected into the right atrium through the proximal port of a pulmonary artery catheter (PAC), and subsequent temperature changes over time are detected by an integrated thermistor more distal in the pulmonary artery [8]. To minimize measurement error and account for cyclic changes in CO throughout the respiratory cycle, CO is calculated based on several consecutive thermodilution CO measurements [8].
In contrast to intermittent pulmonary artery thermodilution, continuous pulmonary artery thermodilution enables CO to be measured automatically (i.e., without the need for manual indicator injection) [11]. PACs for continuous pulmonary artery thermodilution are equipped with a thermal filament heating up the blood in the right ventricle in a random binary sequence [11]. Changes in blood temperature are detected downstream by an integrated thermistor near the tip of the PAC. Based on the detected blood temperature changes, CO is continuously calculated using a stochastic system identification principle and an averaged CO value is provided by the monitor [11].
Because both continuous and intermittent pulmonary artery thermodilution are used in clinical practice, it is important to know whether CO measurements by the two methods are clinically interchangeable. We therefore performed a systematic review and meta-analysis of clinical studies comparing CO measurements obtained using continuous and intermittent pulmonary artery thermodilution.
Methods
Study design and registration
In accordance with the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statement [12], we performed a systematic review and meta-analysis of clinical studies comparing continuous pulmonary artery thermodilution-derived CO measurements (COcont; test method) with intermittent pulmonary artery thermodilution-derived CO measurements (COint; reference method) in adult patients having surgery or critically ill patients treated in the intensive care unit. This systematic review and meta-analysis was registered in the International Prospective Register of Systematic Reviews (PROSPERO; registration number CRD42020159730).
Eligibility criteria
For this systematic review and meta-analysis, we considered studies published in English between January 1st, 1975 and December 31st, 2019 that compared COcont and COint in adult (age ≥ 18 years) surgical or critically ill patients and reported an extractable or calculable mean of the differences between COcont and COint with the corresponding standard deviation (SD) and/or 95% limits of agreement (95% LOA). We did not consider correspondences or case reports.
The electronic databases PubMed, Web of Science, and the Cochrane Library were systematically searched using a priori defined search strategies. As an example, the full electronic search strategy for PubMed is provided in Additional file 1. Further, the reference lists of the identified studies and the reference lists of previous reviews were searched to find additional eligible studies that had not been identified during the initial systematic database search.
Study selection
Titles and abstracts of all identified studies were screened by three investigators (PH, MF, BS). The full text of potentially eligible studies was used to assess study eligibility based on the predefined eligibility criteria described above. Discrepancies were resolved by discussion among the three investigators.
Data collection process and data items
Four different investigators (KK, AB, CV, LB) independently extracted the data from the included studies and data were checked for consistency. Discrepancies were discussed and resolved based on the original data. We extracted data on the results of comparative statistics, i.e., the mean of the differences between COcont and COint with SD, 95% LOA, and the percentage error (PE) [13]. We report the mean of the differences between COcont and COint as COcont − COint. We re-calculated the mean of the differences for studies reporting the mean of the differences as COint − COcont accordingly. If not provided in the studies, the SD of the mean of the differences was re-calculated as (upper 95% LOA − mean of the differences)/1.96. For studies not providing the PE but reporting mean COcont and mean COint, the PE was calculated as (1.96 ⋅ SD of the mean of the differences)/(mean of COcont and COint).
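The re-calculations described above can be sketched in a few lines. The function names are hypothetical. Reading "mean of COcont and COint" as the average of the two reported means is an assumption, as is the mean CO of 6.0 L/min in the example; the mean of the differences (0.08 L/min) and upper 95% LOA (1.85 L/min) are the pooled values reported in the Discussion.

```python
def to_cont_minus_int(mean_diff_int_minus_cont):
    """Flip the sign of a mean difference reported as COint - COcont."""
    return -mean_diff_int_minus_cont


def sd_from_upper_loa(upper_loa, mean_diff):
    """Recover the SD from the upper 95% limit of agreement."""
    return (upper_loa - mean_diff) / 1.96


def percentage_error(sd_diff, mean_co_cont, mean_co_int):
    """Percentage error in percent.

    Assumption: the denominator is the average of the two reported means.
    """
    mean_co = (mean_co_cont + mean_co_int) / 2.0
    return 100.0 * 1.96 * sd_diff / mean_co


# Pooled values from this meta-analysis; the mean CO values are hypothetical.
sd = sd_from_upper_loa(upper_loa=1.85, mean_diff=0.08)
pe = percentage_error(sd, mean_co_cont=6.0, mean_co_int=6.0)
print(round(sd, 3), round(pe, 1))
```

Because the 95% LOA are symmetric around the mean of the differences, the same SD could equally be recovered from the lower limit as (mean of the differences − lower 95% LOA)/1.96.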
In addition to the results of comparative statistics, we extracted data regarding the study setting (operating room or intensive care unit), the patient population, the number of patients, the total number of measurement pairs, and the year of publication.
Risk of bias in individual studies
Based on the Quality Assessment of Diagnostic Accuracy Studies guidelines (QUADAS-2) [14], we used an adapted questionnaire (Additional file 2) to assess study quality by judging the risk of bias and the applicability of the included studies [14–16]. Risk of bias classification is based on signaling questions within each domain that were answered with “yes”, “no” or “unclear”; these answers determine whether a domain is classified as “low”, “high” or “unclear” risk of bias. Concerns about applicability of the included studies were rated as “low”, “high” or “unclear”. An independent quality assessment of each included study was performed by three investigators (KK, AB, LB) and discrepancies were resolved by discussion among the three investigators.
Principal summary measures
The mean of the differences between COcont and COint of the individual studies is the principal summary measure of the current meta-analysis. We used a random effects model for means as outcomes with restricted maximum likelihood as the estimator to summarize the mean of the differences, the SD of the mean of the differences, and the sample size. This random effects model derives a pooled estimate of the mean of the differences that represents the trueness/accuracy of COcont compared to COint.
For each study, we calculated the 95% confidence interval (95% CI) for the reported/calculated mean of the differences between COcont and COint as the mean of the differences ± 1.96 ⋅ standard error of the mean (SD/√sample size) to account for study sample size. We summarized these 95% CIs with the random effects model and report the resulting overall random effects model-derived pooled estimate of the 95% CI.
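The per-study 95% CI can be sketched as follows. The function name and the example values are hypothetical; whether "sample size" refers to the number of patients or of measurement pairs is not specified here, so the argument `n` is left generic.

```python
import math


def ci95_of_mean_difference(mean_diff, sd_diff, n):
    """Return (lower, upper) 95% CI of a mean difference.

    Uses the normal approximation: mean +/- 1.96 * SD / sqrt(n).
    """
    if n <= 0:
        raise ValueError("sample size must be positive")
    sem = sd_diff / math.sqrt(n)  # standard error of the mean
    half_width = 1.96 * sem
    return mean_diff - half_width, mean_diff + half_width


# Hypothetical study: bias 0.10 L/min, SD 1.0 L/min, sample size 100.
lower, upper = ci95_of_mean_difference(0.10, 1.0, 100)
print(round(lower, 3), round(upper, 3))
```

The interval narrows with the square root of the sample size, which is how larger studies receive more weight in the pooled estimate.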
Further, we report overall random effects model-derived pooled estimates of 95% LOA.
We summarized the PE using a random effects model for proportions with DerSimonian-Laird as the estimator [17] and report the overall random effects model-derived pooled estimate of the PE with 95% CI. We defined clinical interchangeability between COcont and COint based on the established 30% PE threshold [13]. Heterogeneity and inconsistency were assessed by means of Cochran’s Q and I².
Synthesis of results
The database includes all relevant data to perform the meta-analysis. To obtain overall random effects model-derived pooled estimates, a random effects model was computed for each outcome. We report Cochran’s Q as a measure of heterogeneity and I² as a measure of inconsistency.
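To make the pooling and heterogeneity statistics concrete, the following is a minimal sketch of a random effects model with the DerSimonian-Laird estimator, including Cochran's Q and I². It is not the analysis code of this study: the analyses were run in R with metafor, and the means were pooled with restricted maximum likelihood, which requires iterative fitting and is not reproduced here. The function name and the two-study example are hypothetical.

```python
import math


def random_effects_dl(effects, variances):
    """DerSimonian-Laird random effects pooling.

    effects: per-study estimates (e.g., mean of the differences)
    variances: per-study sampling variances (squared standard errors)
    """
    k = len(effects)
    if k < 2 or k != len(variances):
        raise ValueError("need at least two studies with matching variances")

    # Fixed effect (inverse variance) estimate, needed for Cochran's Q.
    w = [1.0 / v for v in variances]
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w

    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))
    df = k - 1

    # Between-study variance tau^2, truncated at zero.
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0

    # Random effects weights include tau^2.
    w_star = [1.0 / (v + tau2) for v in variances]
    sum_w_star = sum(w_star)
    pooled = sum(wi * yi for wi, yi in zip(w_star, effects)) / sum_w_star
    se = math.sqrt(1.0 / sum_w_star)

    # I^2: percentage of total variability attributed to heterogeneity.
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0

    return {
        "pooled": pooled,
        "ci95": (pooled - 1.96 * se, pooled + 1.96 * se),
        "tau2": tau2,
        "Q": q,
        "I2": i2,
    }


# Two hypothetical studies with opposite biases of equal size.
result = random_effects_dl([-1.0, 1.0], [0.25, 0.25])
print(result)
```

In this hypothetical example the pooled estimate is zero although each study reports a bias of 1 L/min in absolute terms, while Q and I² are high. This is why a small pooled mean of the differences should be read together with the heterogeneity statistics.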
Risk of publication bias across studies
We generated funnel plots with corresponding Egger’s regression tests for asymmetry to address the potential problem of selective reporting [18].
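As an illustration of the idea behind Egger’s test, the sketch below implements the classical form: the standardized effect (effect divided by its standard error) is regressed on precision (one divided by the standard error), and an intercept far from zero indicates funnel plot asymmetry. This is an assumption about the variant; the exact model fitted by the software used in this study may differ. The function name and the data are hypothetical, and the p-value (from a t-distribution with n − 2 degrees of freedom) is omitted to keep the example within the standard library.

```python
import math


def egger_regression(effects, standard_errors):
    """Classical Egger regression: effect/SE regressed on 1/SE.

    Returns intercept, slope, and the t-statistic of the intercept.
    """
    n = len(effects)
    if n < 3 or n != len(standard_errors):
        raise ValueError("need at least three studies with matching SEs")

    x = [1.0 / s for s in standard_errors]                  # precision
    z = [e / s for e, s in zip(effects, standard_errors)]   # standardized effect

    x_bar = sum(x) / n
    z_bar = sum(z) / n
    sxx = sum((xi - x_bar) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("standard errors must not all be identical")
    sxz = sum((xi - x_bar) * (zi - z_bar) for xi, zi in zip(x, z))

    slope = sxz / sxx
    intercept = z_bar - slope * x_bar

    # Residual variance and standard error of the intercept (ordinary least squares).
    rss = sum((zi - intercept - slope * xi) ** 2 for xi, zi in zip(x, z))
    s2 = rss / (n - 2)
    se_intercept = math.sqrt(s2 * (1.0 / n + x_bar ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else float("nan")

    return {"intercept": intercept, "slope": slope, "t": t_stat}


# Hypothetical asymmetric data: effect grows with the standard error.
ses = [0.1, 0.2, 0.4, 0.5]
effects = [0.5 + 2.0 * s for s in ses]
print(egger_regression(effects, ses))
```

In the hypothetical data each effect equals 0.5 plus twice its standard error, so the regression recovers an intercept of about 2 and a slope of about 0.5. With symmetric data the intercept would be close to zero.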
Subgroup analyses, additional analyses
We performed subgroup analyses considering the factors “setting” (operating room and intensive care unit) and “patient population” (liver transplantation and cardiac surgery).
Additionally, we investigated the relation between the mean of the differences between COcont and COint from individual studies and a) the reported mean COint and b) the year of publication.
Statistical software
We used the software R version 4.0.2 (R Foundation for Statistical Computing, Vienna, Austria) with the R-package metafor version 2.4-0 for statistical analyses [19].
Discussion
In this meta-analysis of clinical studies comparing COcont and COint in adult surgical and critically ill patients, the heterogeneity across studies was high. The overall random effects model-derived pooled estimate of the mean of the differences between COcont and COint was 0.08 L/min with pooled 95% LOA of −1.68 to 1.85 L/min and a pooled PE of 29.7% (95% CI 20.5 to 38.9%).
In CO method comparison studies, the agreement between a test and a reference method is described by the trueness (often called “accuracy”) and precision of agreement [74–76] based on Bland–Altman analysis [77–79]. In Bland–Altman plots, the difference between measurements with a test and a reference method is plotted against the mean of the two measurements [77–79]. The mean of the differences (often called “bias”) reflects the trueness of test method measurements; the SD and 95% LOA of the mean of the differences reflect the precision of agreement [74–76]. The PE is used frequently in CO method comparison studies to characterize the precision of agreement; the PE is 1.96 ⋅ SD of the mean of the differences between measurements divided by the mean value of all measurements [13]. In their landmark study, Critchley et al. proposed 28.3%, rounded up to 30%, as the PE threshold defining interchangeability [13]. Nevertheless, one should keep in mind that the PE threshold of 28.3% is based on the assumption that the precision of both the test method and the reference method is 20%. Because the precision of method is not exactly known, using a 30% PE threshold may lead to misinterpretations concerning the clinical interchangeability of COcont and COint.
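The arithmetic behind the 28.3% threshold can be shown in a short sketch: if the errors of the two methods are independent, their precisions combine as the square root of the sum of squares, so 20% and 20% give about 28.3%. The function names are hypothetical, and the alternative precision values are illustrative assumptions, not data from this meta-analysis.

```python
import math


def combined_percentage_error(precision_test, precision_ref):
    """Expected PE (%) when two methods with independent errors are compared."""
    return math.sqrt(precision_test ** 2 + precision_ref ** 2)


def implied_test_precision(observed_pe, precision_ref):
    """Test method precision (%) implied by an observed PE.

    Assumes independent errors and a known reference method precision.
    """
    if observed_pe < precision_ref:
        raise ValueError("observed PE is smaller than the reference precision")
    return math.sqrt(observed_pe ** 2 - precision_ref ** 2)


# Both methods at 20% precision: the basis of the 28.3% threshold.
print(round(combined_percentage_error(20.0, 20.0), 1))

# Illustrative assumption: a more precise reference method (10%).
print(round(combined_percentage_error(20.0, 10.0), 1))

# Pooled PE of 29.7% with an assumed reference precision of 20%.
print(round(implied_test_precision(29.7, 20.0), 1))
```

The sketch shows why the threshold depends on the assumed reference precision: with a reference precision of 10% instead of 20%, the same test method would be expected to yield a PE of roughly 22%, so a fixed 30% cutoff would be more lenient than intended.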
In this meta-analysis, the overall random effects model-derived pooled estimate of the mean of the differences between COcont and COint was < 0.1 L/min, which is less than a 2% difference for an average adult CO of 5 to 6 L/min. This meta-analysis thus suggests a good trueness/accuracy of COcont compared with COint when looking at the overall pooled mean of the differences. However, a low pooled mean of the differences in meta-analyses can be misleading because averaging study results with negative and positive means of the differences of similar absolute amount can result in a very low pooled mean of the differences despite marked measurement differences in single studies. In this meta-analysis, studies reporting an overestimation and those reporting an underestimation of COcont compared to COint neutralized each other, as illustrated in Fig. 2.
Regarding the precision of agreement between COcont and COint, this meta-analysis revealed that the pooled 95% LOA of the mean of the differences between COcont and COint were −1.68 to 1.85 L/min. The overall random effects model-derived pooled estimate of the PE was 29.7% (95% CI 20.5 to 38.9%), suggesting that COcont only barely meets the criterion for interchangeability with COint [13]. However, the PE was only available for half of all studies because the PE per se or mean CO values necessary for post-hoc PE calculation were not always reported. Nevertheless, 95% CIs were similar in studies with reported or calculable PE and studies where the PE was not reported or calculable, suggesting that the PE for all studies would probably also be close to 30%.
This meta-analysis showed a large variability in results between studies, with means of the differences reported in single studies ranging from −0.79 to 1.00 L/min and PEs ranging from 4.8 to 89.3%. This variability strongly suggests that the measurement performance of COcont is influenced by various factors that may include patient characteristics, the clinical setting, and cardiovascular dynamics. Even subgroups of studies were heterogeneous. For example, the “operating room” subgroup included patients having different types of surgery, the “intensive care unit” subgroup included patients with and without circulatory shock requiring different vasopressor and inotropic support, and the “cardiac surgery” subgroup included patients studied either during or after surgery. It is important to bear in mind that the measurement performance is context-sensitive when interpreting validation studies of any CO monitoring system [80].
Intermittent pulmonary artery thermodilution remains the clinical reference method for CO monitoring and therefore is frequently used as the “gold standard” in method comparison studies [10]. Continuous pulmonary artery thermodilution offers the opportunity to measure CO automatically without the need for manual indicator injection, thus reducing contamination risk and saving time [81]. Although “continuous” suggests that this PAC technology provides real-time CO measurements, it actually provides “semi-continuous”, averaged CO values [11, 81]. The averaging procedure improves the signal-to-noise ratio but may cause a time delay of up to several minutes. This time delay may become relevant when hemodynamics change rapidly, e.g., during dynamic tests such as passive leg raising and during therapeutic interventions such as fluid or vasopressor administration [8, 82, 83].
In today’s clinical practice, PACs are mainly used in patients having cardiac surgery or liver transplantation and in critically ill patients with circulatory shock, especially with right ventricular dysfunction [10, 84]. Using a PAC allows monitoring of CO, mixed venous oxygen saturation, and intravascular pressure and thus provides important information on cardiovascular dynamics [85]. There is nonetheless an ongoing debate on whether or not PACs still have a place in daily clinical practice [86–88]. Some trials showed no clinical benefit of using the PAC without treatment protocols in critically ill patients [89, 90] or cardiac surgery patients [91]. Additionally, there are now various methods to measure or estimate CO less invasively or even non-invasively [3, 6]. The clinical use of the PAC has thus decreased in recent years in critically ill patients and in surgical patients [92, 93].
Although intermittent and continuous pulmonary artery thermodilution methods are widely used, we are not aware of any previous meta-analysis investigating the overall agreement between the two methods. In contrast, several meta-analyses have already been published for Doppler [94, 95], bioimpedance [15, 94], as well as invasive and non-invasive pulse contour methods [15, 16, 94]. They all reported pooled PE values ranging between 40 and 50%.
We only investigated the absolute agreement between COcont and COint and did not analyze the trending ability of COcont. The ability to track changes in CO is actually the main expectation clinicians may have from a continuous monitoring system over an intermittent technique. Unfortunately, most studies of this meta-analysis did not report concordance rates or polar plots, so we were unable to assess the ability of continuous pulmonary artery thermodilution to track changes in CO. Furthermore, several studies [19 of 54 (35%)] had a risk of bias classification of “unclear” or “high”, which may further influence the final results of this meta-analysis. About half of the included studies [26 of 54 (48%)] were performed before the year 2000, and only 6 (11%) studies after 2010.