Point‐of‐care ultrasonography for diagnosing thoracoabdominal injuries in patients with blunt trauma

Abstract

Background

Point‐of‐care sonography (POCS) has emerged as the screening modality of choice for suspected body trauma in many emergency departments worldwide. Its best known application is FAST (focused abdominal sonography for trauma). The technology is almost ubiquitously available, can be performed during resuscitation, and does not expose patients or staff to radiation. While many authors have stressed the high specificity of POCS, its sensitivity varied markedly across studies. This review aimed to compile the current best evidence about the diagnostic accuracy of POCS imaging protocols in the setting of blunt thoracoabdominal trauma.

Objectives

To determine the diagnostic accuracy of POCS for detecting and excluding free fluid, organ injuries, vascular lesions, and other injuries (e.g. pneumothorax) compared to a diagnostic reference standard (i.e. computed tomography (CT), magnetic resonance imaging (MRI), thoracoscopy or thoracotomy, laparoscopy or laparotomy, autopsy, or any combination of these) in patients with blunt trauma.

Search methods

We searched Ovid MEDLINE (1946 to July 2017) and Ovid Embase (1974 to July 2017), as well as PubMed (1947 to July 2017), employing a prospectively defined literature and data retrieval strategy. We also screened the Cochrane Library, Google Scholar, and BIOSIS for potentially relevant citations, and scanned the reference lists of full‐text papers for articles missed by the electronic search. We performed a top‐up search on 6 December 2018, and identified eight new studies which may be incorporated into the first update of this review.

Selection criteria

We assessed studies for eligibility using predefined inclusion and exclusion criteria. We included either prospective or retrospective diagnostic cohort studies that enrolled patients of any age and gender who sustained any type of blunt injury in a civilian scenario. Eligible studies had to provide sufficient information to construct a 2 x 2 table of diagnostic accuracy to allow for calculating sensitivity, specificity, and other indices of diagnostic test accuracy.

Data collection and analysis

Two review authors independently screened titles, abstracts, and full texts of reports using a prespecified data extraction form. Methodological quality of individual studies was rated by the QUADAS‐2 instrument (the revised and updated version of the original Quality Assessment of Diagnostic Accuracy Studies list of items). We calculated sensitivity and specificity with 95% confidence intervals (CI), tabulated the pairs of sensitivity and specificity with CI, and depicted these estimates by coupled forest plots using Review Manager 5 (RevMan 5). For pooling summary estimates of sensitivity and specificity, and investigating heterogeneity across studies, we fitted a bivariate model using Stata 14.0.
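As an illustration of how a 95% confidence interval for a single study's sensitivity or specificity can be obtained, the following minimal sketch computes a Wilson score interval. The interval method used by RevMan 5 is not stated here, so the choice of the Wilson method, the function name `wilson_interval`, and the example counts are assumptions made purely for illustration.

```python
import math


def wilson_interval(events, total, z=1.959964):
    """Wilson score interval for a proportion (z = 1.959964 gives about 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = events / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * math.sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    # Clip to [0, 1] to guard against floating-point overshoot at the boundaries.
    return max(0.0, centre - half_width), min(1.0, centre + half_width)


# Hypothetical study (not from the review): 45 of 60 injured patients had a positive scan.
low, high = wilson_interval(45, 60)
print(f"sensitivity {45 / 60:.2f}, 95% CI {low:.2f} to {high:.2f}")
```

The pooled estimates in this review come from a bivariate model and cannot be reproduced by a per-study interval of this kind.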

Main results

We included 34 studies with 8635 participants in this review. Summary estimates of sensitivity and specificity were 0.74 (95% CI 0.65 to 0.81) and 0.96 (95% CI 0.94 to 0.98). Pooled positive and negative likelihood ratios were estimated at 18.5 (95% CI 10.8 to 40.5) and 0.27 (95% CI 0.19 to 0.37), respectively. There was substantial heterogeneity across studies, and the reported accuracy of POCS strongly depended on the population and affected body area. In children, pooled sensitivity of POCS was 0.63 (95% CI 0.46 to 0.77), as compared to 0.78 (95% CI 0.69 to 0.84) in an adult or mixed population. Associated specificity in children was 0.91 (95% CI 0.81 to 0.96) and in an adult or mixed population 0.97 (95% CI 0.96 to 0.99). For abdominal trauma, POCS had a sensitivity of 0.68 (95% CI 0.59 to 0.75) and a specificity of 0.95 (95% CI 0.92 to 0.97). For chest injuries, sensitivity and specificity were calculated at 0.96 (95% CI 0.88 to 0.99) and 0.99 (95% CI 0.97 to 1.00). If we consider the results of all 34 included studies in a virtual population of 1000 patients, based on the observed median prevalence (pretest probability) of thoracoabdominal trauma of 28%, POCS would miss 73 patients with injuries and falsely suggest the presence of injuries in another 29 patients. Furthermore, in a virtual population of 1000 children, based on the observed median prevalence (pretest probability) of thoracoabdominal trauma of 31%, POCS would miss 118 children with injuries and falsely suggest the presence of injuries in another 62 children.
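The virtual‐cohort figures above follow from simple arithmetic on the pooled estimates. The sketch below is a minimal illustration, not the review's own code: it applies a prevalence, sensitivity, and specificity to 1000 hypothetical patients. The function name `virtual_cohort` is our own, and the children‐only call assumes the sensitivity of 0.62 reported in the Summary of findings table, which is the value that reproduces the 118 missed injuries.

```python
def virtual_cohort(prevalence, sensitivity, specificity, size=1000):
    """Return (injured, missed, uninjured, overtreated) as whole patients."""
    injured = round(size * prevalence)
    uninjured = size - injured
    missed = round(injured * (1 - sensitivity))          # false negatives
    overtreated = round(uninjured * (1 - specificity))   # false positives
    return injured, missed, uninjured, overtreated


print(virtual_cohort(0.28, 0.74, 0.96))  # all studies   -> (280, 73, 720, 29)
print(virtual_cohort(0.31, 0.62, 0.91))  # children only -> (310, 118, 690, 62)
```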

Authors' conclusions

In patients with suspected blunt thoracoabdominal trauma, positive POCS findings are helpful for guiding treatment decisions. However, with regard to abdominal trauma, a negative POCS exam does not rule out injuries and must be verified by a reference test such as CT. This is of particular importance in paediatric trauma, where the sensitivity of POCS is poor. Based on a small number of studies in a mixed population, POCS may have a higher sensitivity in chest injuries. This warrants larger, confirmatory trials to affirm the accuracy of POCS for diagnosing thoracic trauma.

Plain language summary

How accurate is bedside ultrasound for the diagnosis of injuries to the abdomen or chest in patients with blunt injuries?

Background and aims

People who sustain a road traffic crash or fall from a height are at risk for blunt body trauma (i.e. non‐penetrating trauma) and multiple injuries. Medical professionals caring for these patients in hospital need to know if vital organs or vessels are damaged, and whether there is any major bleeding that requires immediate intervention. Point‐of‐care sonography (POCS), a form of ultrasound, is a non‐invasive, radiation‐free, portable imaging technique that can be used at the patient's bedside. It is frequently used to help diagnose injuries in the emergency department. We reviewed the best scientific evidence about the accuracy of POCS, that is its ability to identify or exclude injuries correctly, compared to other diagnostic tests. We considered computed tomography, laparotomy, and autopsy to be good comparative tests against which to measure the accuracy of POCS.

Study characteristics

We searched for studies from the year in which the first paper about using ultrasound to diagnose trauma patients was published until 15 July 2017. We considered 2296 records and included 34 relevant studies that involved 8635 participants in this review. All 34 studies were published between 1992 and 2017, with the number of participants in each study ranging from 51 to 3181. Ten studies included only children, two studies only adults, and the remaining 22 studies included both children and adults.

Quality of the evidence

In many studies, important information about the selection of participants and choice of the diagnostic tests against which to compare POCS was not reported. We therefore rated the methodological quality of the available evidence mostly as unclear.

Key results

Point‐of‐care sonography had a sensitivity (i.e. the ability to correctly identify a person with an injury) of 74% and a specificity (i.e. the ability to correctly identify a person without an injury) of 96%. Sensitivity and specificity varied considerably across studies, which was due in part to variation in study, participant, and injury characteristics. In children, both the sensitivity and specificity of POCS were lower than in an adult or mixed population, meaning that POCS was less able to identify or rule out an injury. Based on our results, we would expect that amongst 1000 patients of a mixed‐age population with suspected blunt trauma to the abdomen or chest, POCS would miss 73 patients with injuries, and would falsely suggest the presence of injuries in 29 patients who were unaffected. This result emphasises the need for additional imaging in trauma patients for whom POCS shows no injuries (i.e. a negative result), to check whether they are really injury‐free.

Authors' conclusions

Implications for practice

Following the 'treat first what kills first' principle, any active non‐compressible bleeding in the major body cavities and retroperitoneum represents a priority condition to be addressed immediately (e.g. by pelvic stabilisation, haemostatic transfusion, tranexamic acid, etc.). As damage‐control resuscitation often needs a substantial number of precious packed red blood and fresh frozen plasma units, the high specificity of point‐of‐care sonography (POCS) (0.96, 95% confidence interval (CI) 0.94 to 0.98) may avoid a waste of resources (which is of particular importance in mass casualties), overtreatment, and unnecessary invasive procedures, as false‐positive findings are very unlikely (with 266 false positives (i.e. 3.1%) out of 8635 individuals). Also, the accuracy of ultrasonography for identifying chest injuries such as pneumothorax with a sensitivity calculated at 0.96 (95% CI 0.88 to 0.99) and a specificity calculated at 0.99 (95% CI 0.97 to 1.00) based on four studies is remarkable and may replace traditional posteroanterior radiographs.

However, despite the advantage of the high specificity of ultrasonography, multiprofessional trauma care teams need to be aware that a negative examination bears a relevant risk of being false negative (i.e. negative predictive value (0.90, 95% CI 0.87 to 0.93)). Where there is a high prior probability of thoracoabdominal trauma (e.g. because of the injury mechanism), a negative scan may also be caused by centralised circulation and limited arterial perfusion of injured solid organs like the liver or spleen.

Again, it remains important to consider the individual clinical scenario when interpreting POCS findings. While positive results will almost always be trustworthy and should prompt bleeding control measures, negative scans must be confirmed by a reference test like computed tomography (CT), or, in the case of limited resources, by sequential sonograms and clinical observation. This is of particular importance in paediatric trauma, where the sensitivity of POCS is extremely poor (0.62, 95% CI 0.47 to 0.75), potentially resulting in 118 children with missed injuries in a cohort of 1000 children with suspected blunt thoracoabdominal trauma. These accuracy patterns are probably a signature feature of POCS that cannot be overcome even by state‐of‐the‐art equipment.

Implications for research

In high‐income countries, the availability of fast CT scanners right at or close to the trauma bay, together with whole‐body scanning protocols and dose‐reducing algorithms, has substantially reduced the clinical importance of POCS in routine trauma care. Additional studies on the accuracy of this technology to detect abdominal injuries may thus have little impact on care processes in all age groups. More accurate reporting of individual study characteristics (e.g. selection of participants, examiner's experience) would help to better evaluate potential sources of heterogeneity in the diagnosis of blunt thoracoabdominal trauma, and to assess the risk of bias. Nonetheless, more and robust data from larger, confirmatory studies using CT as the ultimate reference test are required to define the role of POCS for detecting pneumothorax and haematothorax and facilitating early tube thoracostomy. Studies determining the accuracy and utility of POCS in mass casualty and low‐ and middle‐income countries are needed; however, guaranteeing consistent confirmation of POCS findings by objective reference tests in these settings will be challenging.

Summary of findings

Summary of findings 1.

Population

Patients of any age and gender who sustained any type of blunt injury in a civilian scenario

Setting

Clinical evaluation at hospitals of any care level

Index test

Point‐of‐care sonography (POCS) as the primary imaging tool

Reference standard

Computed tomography (CT), magnetic resonance imaging (MRI), laparotomy, laparoscopy, thoracotomy, thoracoscopy, autopsy

Findings

  1. POCS emerged as an integral part of trauma algorithms, and remains the point‐of‐care imaging tool of choice for screening for thoracoabdominal bleeding in most regions of the world.

  2. Determining the diagnostic accuracy of POCS in patients with blunt trauma may provide clinicians with valuable information on the likelihood of chest and abdominal injuries and may contribute to decision making regarding the performance of subsequent diagnostic tests.

Limitations

  1. Methodological quality was hampered by severe under‐reporting in the included studies. We assessed risk of bias as unclear in more than half of the studies for the domains of patient selection and reference standard, and in one‐third of the studies for the index test.

  2. There was substantial heterogeneity among the results of the individual studies, which we investigated further by examining potential sources of heterogeneity (see Summary of findings 2).

| Measure | All included studies | Sensitivity analysis with a children‐only cohort |
| --- | --- | --- |
| No. of participants (studies) | 8635 (34) | 1384 (10) |
| Summary sensitivity (95% CI) | 0.74 (0.65 to 0.81) | 0.62 (0.47 to 0.75) |
| Summary specificity (95% CI) | 0.96 (0.94 to 0.98) | 0.91 (0.81 to 0.96) |
| Summary LR+ (95% CI) | 18.5 (10.8 to 40.5) | 6.9 (2.5 to 18.8) |
| Summary LR‐ (95% CI) | 0.27 (0.19 to 0.37) | 0.42 (0.26 to 0.65) |
| Positive predictive value (95% CI) | 0.88 (0.81 to 0.94) | 0.76 (0.53 to 0.89) |
| Negative predictive value (95% CI) | 0.90 (0.87 to 0.93) | 0.84 (0.77 to 0.90) |
| Consequences in a virtual cohort of 1000 [a]: missed injuries | 73 (If 280 people suffer an injury through trauma, 207 will be identified as injured, and 73 will be missed.) | 118 (If 310 children suffer an injury through trauma, 192 will be identified as injured, and 118 will be missed.) |
| Consequences in a virtual cohort of 1000 [a]: overtreated | 29 (If 720 people do not suffer an injury through trauma, 29 will be treated as though they had been injured, i.e. overtreated.) | 62 (If 690 children do not suffer an injury through trauma, 62 will be treated as though they had been injured, i.e. overtreated.) |

[a] The median prevalence was 28% for the complete study population and 31% for the children‐only cohort.

Abbreviations

CI: confidence interval
LR+: positive likelihood ratio
LR‐: negative likelihood ratio
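The point estimates for the likelihood ratios and predictive values in the table can be recovered from the summary sensitivity, summary specificity, and median prevalence. The sketch below shows the standard formulas; the function names are our own, and the confidence intervals in the table come from the fitted model and are not reproduced here.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for a binary test."""
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg


def predictive_values(prevalence, sensitivity, specificity):
    """Return (PPV, NPV) at a given prevalence (pretest probability)."""
    tp = prevalence * sensitivity
    fn = prevalence * (1 - sensitivity)
    fp = (1 - prevalence) * (1 - specificity)
    tn = (1 - prevalence) * specificity
    return tp / (tp + fp), tn / (tn + fn)


# All included studies: sensitivity 0.74, specificity 0.96, prevalence 28%
print([round(x, 2) for x in likelihood_ratios(0.74, 0.96)])        # [18.5, 0.27]
print([round(x, 2) for x in predictive_values(0.28, 0.74, 0.96)])  # [0.88, 0.9]

# Children-only cohort: sensitivity 0.62, specificity 0.91, prevalence 31%
print([round(x, 2) for x in likelihood_ratios(0.62, 0.91)])        # [6.89, 0.42]
print([round(x, 2) for x in predictive_values(0.31, 0.62, 0.91)])  # [0.76, 0.84]
```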

Summary of findings 2. Investigation of heterogeneity

| Investigation of heterogeneity | Subgroup | Number of studies | Summary sensitivity (95% CI) | Summary specificity (95% CI) | Chi2 [a] | P value [b] |
| --- | --- | --- | --- | --- | --- | --- |
| Reference standard | Single CT | 25 | 0.75 (0.63 to 0.84) | 0.97 (0.93 to 0.98) | 0.18 (overall) | 0.9160 (overall) |
| | CT plus laparotomy | 7 | 0.73 (0.58 to 0.84) | 0.95 (0.87 to 0.98) | | |
| Target condition | Limited to free fluid/free air | 22 | 0.78 (0.68 to 0.85) | 0.97 (0.96 to 0.99) | 9.10 (overall); 0.06 (sensitivity); 8.08 (specificity) | 0.0106 (overall); 0.8100 (sensitivity); 0.0045 (specificity) |
| | Free fluid/free air and organ injuries/vascular lesions | 7 | 0.80 (0.73 to 0.85) | 0.88 (0.70 to 0.96) | | |
| Age of participant | Children | 10 | 0.63 (0.46 to 0.77) | 0.91 (0.81 to 0.96) | 7.32 (overall); 54.91 (sensitivity); 19.88 (specificity) | 0.0258 (overall); 0.0000 (sensitivity); 0.0000 (specificity) |
| | Adults/mixed | 24 | 0.78 (0.69 to 0.84) | 0.97 (0.96 to 0.99) | | |
| Type of injury | Abdominal injury | 27 | 0.68 (0.59 to 0.75) | 0.95 (0.92 to 0.97) | 17.36 (overall); 13.22 (sensitivity); 5.39 (specificity) | 0.0002 (overall); 0.0003 (sensitivity); 0.0202 (specificity) |
| | Thoracic injury | 4 | 0.96 (0.88 to 0.99) | 0.99 (0.97 to 1.00) | | |

[a] Large values of the Chi2 statistic indicate that test performance may be associated with the particular covariate.

[b] P values < 0.05 indicate statistical evidence that sensitivity and/or specificity differ between the examined groups.

Abbreviations

CI: confidence interval
CT: computed tomography
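For readers who wish to relate the Chi2 statistics to the P values in the table, the sketch below converts a chi‐squared statistic to an upper‐tail probability. The degrees of freedom are not stated in the table; we assume 2 for the 'overall' tests (sensitivity and specificity jointly) and 1 for the separate sensitivity and specificity tests. Under that assumption the tabulated P values are approximately recovered. The function name is our own.

```python
import math


def p_value_chi2(chi2, df):
    """Upper-tail probability of a chi-squared statistic for df = 1 or 2."""
    if df == 1:
        # P(X > x) = P(|Z| > sqrt(x)) for a standard normal Z
        return math.erfc(math.sqrt(chi2 / 2))
    if df == 2:
        # The chi-squared distribution with 2 df is exponential with mean 2
        return math.exp(-chi2 / 2)
    raise ValueError("only df = 1 or 2 are handled in this sketch")


# Target condition (assumed df: overall = 2, specificity = 1)
print(round(p_value_chi2(9.10, 2), 4))  # 0.0106
print(round(p_value_chi2(8.08, 1), 4))  # 0.0045
```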

Background

Target condition being diagnosed

Trauma, including multiple trauma (defined by an Injury Severity Score (ISS) ≥ 16, or, according to the new Berlin definition, by an Abbreviated Injury Scale (AIS) ≥ three for two or more different body regions and one or more additional variables from five physiologic parameters) (Pape 2014), remains a major cause of death and disability worldwide. Severe trauma mainly results from road traffic crashes and falls from a height. In 2010, according to data from the World Health Organization (WHO) Global Burden of Disease Project, motor vehicle crashes ranked eighth in the global death toll (Lozano 2012), and tenth in all sources of disability‐adjusted life years (Murray 2012). The WHO and United Nations Decade of Action for Road Safety campaign 2011 to 2020 was launched to raise awareness about this public health concern and to implement simple and effective primary prevention measures.

A 'treat first what kills first' strategy is now in place at most trauma centres across the world, fostered by standardised management algorithms such as Advanced Trauma Life Support (ATLS). Key steps of these algorithms are (Chapleau 2013):

  1. maintain airways and establish sufficient oxygenation (i.e. by intubation and tube thoracostomy in case of pneumo‐ or haematothorax);

  2. stop traumatic bleeding (e.g. by tourniquets on extremities, pelvic binders and external fixators, surgical or interventional control of haemorrhage, application of antifibrinolytics such as tranexamic acid, and transfusion of blood products, mainly coagulation factors).

Data from the German Trauma Registry suggest an overall mortality of 10% among severely injured patients managed within organised trauma networks and at high‐volume trauma centres (German Trauma Society 2014). There may be a biological threshold in trauma survivability that cannot be overcome by any of the treatment modalities currently available, and extra translational research efforts are needed to make a difference in future. Apart from unsurvivable brain and upper cervical spine injuries, the leading causes of early death in multiple trauma are chest injuries and abdominal and retroperitoneal haemorrhage (Pfeifer 2016). The presence of free fluid surrounding the liver or spleen, capsular tears, organ contusions or lacerations, and vascular lesions influences early decision making in major trauma.

Stabbing (by sharp tools or weapons such as knives) and shooting are associated with a high chance of organ or vessel injury. The distinct location of wounds may point towards significant trauma to the lungs, heart, mediastinum, liver, spleen, thoracic and/or abdominal aorta. The quality and quantity of injuries sustained in civilian settings and armed conflicts differ in many ways (e.g. by type of weapon, gun, or bullet, wound ballistics, protective armour, austere environment (i.e. where medical care is provided under less than optimal sanitary or hospital‐like conditions) and others). Most patients with penetrating trauma need immediate surgical exploration (specifically in case of haemodynamic instability), and preoperative imaging has a rather ancillary role in this situation.

In blunt trauma, however, radiographic imaging is an inevitable part of clinical work‐up. Physical examination may reveal indirect signs of internal injury (e.g. contusion marks), but these signs are inconsistent and neither sensitive nor specific. Computed tomography (CT) is regarded as the imaging standard in the emergency department and is currently also the undisputed diagnostic reference test in the trauma scenario. If patients are transferred immediately to the operating theatre before CT imaging, emergency laparotomy, laparoscopy, or thoracotomy is the reference standard of choice. If patients die in the emergency department before any imaging or surgical procedure can be undertaken, definitive diagnoses are obtained during pathological or forensic autopsy. Point‐of‐care sonography (POCS), however, can be performed during resuscitation, repeated wherever and whenever needed, and does not involve exposure to radiation.

Point‐of‐care sonography has emerged as an integral part of trauma algorithms and is the initial screening modality of choice for thoracoabdominal bleeding in most regions of the world. As with any other imaging procedure or diagnostic test used for screening purposes, it is important to verify that:

  1. a negative index test result is reliable for excluding the condition of interest (guaranteeing that episodes of haemodynamic instability during decompressive brain surgery or fixation of spine, pelvic, or femoral fractures are not caused by sudden major abdominal, thoracic, or retroperitoneal bleeding);

  2. a positive index test result is reliable for proving the condition of interest (thus minimising the number of negative or unnecessary thoracotomies and laparotomies, or their minimally invasive equivalents).

Both false‐negative and false‐positive findings of POCS may misguide trauma teams and affect care priorities adversely.

Diagnostic accuracy (or efficacy) is the first level of the Fryback‐Thornbury hierarchy of evaluating the utility of a diagnostic test procedure (Fryback 1991). While the value and utility of a certain test cannot be derived from its accuracy alone, it would be absurd to ask for the effectiveness or efficiency of an inaccurate diagnostic test.

Determining accuracy is thus the first indispensable step in health technology assessment of POCS. This review aimed to generate the best available evidence about the diagnostic accuracy of clinical ultrasound imaging protocols in the setting of thoracoabdominal and multiple trauma compared to appropriate reference standards. It will guide clinicians regarding the likelihood of chest and abdominal injuries given certain prior probabilities and ultrasound findings, and may facilitate the decision to perform a CT scan or to schedule patients for emergency laparoscopy or laparotomy, or other interventional procedures.
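The way a prior probability and an ultrasound finding combine into a post‐test probability can be sketched with the odds form of Bayes' theorem. The following is a minimal illustration with a function name of our own choosing; the example values are the pooled likelihood ratios and the median prevalence of 28% reported in this review.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Update a pretest probability with a likelihood ratio via odds."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Pretest probability 28%; pooled LR+ = 18.5 and LR- = 0.27
print(round(post_test_probability(0.28, 18.5), 2))  # 0.88 after a positive scan
print(round(post_test_probability(0.28, 0.27), 2))  # 0.1 after a negative scan
```

Under these inputs, roughly one in ten patients with a negative scan would still have an injury, which is consistent with the negative predictive value of 0.90.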

Given the (higher) potential utility and value of POCS in blunt compared to penetrating trauma, this review considered only original studies that included participants with blunt injuries or, in a mixed population, provided sufficient details to explore the accuracy of POCS in this group.

Another aspect that requires scrutiny is the use of POCS in paediatric trauma algorithms. Children are vulnerable to radiation for diagnostic purposes, and their lifetime‐attributable risk (LAR) of cancer due to medical imaging must be kept to the necessary minimum. Still, there may be situations in which acute and potentially life‐threatening conditions require radiation‐emitting (i.e. multi‐detector row computed tomography (MDCT)) rather than radiation‐free imaging techniques (e.g. POCS or magnetic resonance imaging (MRI)).

Index test(s)

Ultrasound has emerged as a standard for bedside imaging in emergency departments worldwide. Technological progress has led to increasingly light and mobile (i.e. handheld) equipment (also available in the preclinical setting, e.g. on helicopters or rescue vehicles). Further advancements include colour‐duplex, contrast‐enhanced imaging, and even three‐dimensional (3D) scanning.

In the trauma setting, POCS is typically performed as focused abdominal sonography for trauma (FAST) (Scalea 1999). In its basic form, FAST includes oblique views of the left upper, right upper, left lower, and right lower abdominal quadrants, as well as a sagittal scan of the mid‐abdomen and a transverse view of the pelvic region. The key target of the original FAST scan is free fluid as a surrogate of blood or active bleeding.

The genuine FAST protocol has been modified and supplemented in many ways. The most useful and technically simple extensions were to screen for haematothorax (using oblique or intercostal planes, or both) and, by a xiphoid view, for pericardial effusion. Point‐of‐care sonography has also proved to be reliable in detecting pneumothorax (Blaivas 2005). Skilled examiners may be able to show and grade abdominal organ injury, although this is likely to exceed the diagnostic limits of POCS in the early resuscitation phase.

In this review, we have used the term POCS rather than FAST because of the varying definitions and targets established in different centres and countries. In clinical practice, ultrasound (or ultrasonography) as an imaging technique is commonly abbreviated and understood as sonography. Altogether, the technological evolution of hardware, increasing skills of operators, and significant advancements in picture acquisition and processing have changed the view of healthcare providers about the role of ultrasonography in the critical care setting substantially. Ultrasound has evolved from a rough screening tool to a conclusive imaging modality.

The index test for this review was therefore any clinical POCS application performed in the setting of blunt trauma that is intended to detect direct or indirect signs of injuries of the thoracic, abdominal, or retroperitoneal cavity or space and/or its organs and vessels.

Clinical pathway

Clinical examination alone has little ‐ if any ‐ role in excluding injuries to the chest or abdomen. The presence of external injuries, such as seatbelt marks, may increase the likelihood of visceral tears, but their absence does not exclude important trauma. Currently, all major trauma algorithms incorporate thoracoabdominal POCS as a diagnostic imaging tool. However, the interpretation of ultrasound images depends on the experience and clinical background of individual operators. This subjective component influences decision making, and hampers comparisons between initial and follow‐up scans taken by different examiners. In 2013, Van Vugt and colleagues published an evidence‐based work‐up protocol for blunt trauma that illustrated the benefits of training trauma teams in POCS (FAST) in combination with an ATLS course (Van Vugt 2013).

Alternative test(s)

Currently, POCS is challenged by the liberal early use of MDCT, either as abdominal, thoracic, thoracoabdominal, or whole‐body MDCT. The latter has emerged as the diagnostic modality of choice in most European trauma centres, and is used in the USA and other high‐income nations as well. The so‐called 'pan‐scan' usually comprises a native cranial CT, followed by a contrast‐enhanced CT from the skull base to the pelvis and/or trochanteric region. Whole‐body MDCT is highly specific, thereby minimising false‐positive findings (Stengel 2012), and may thus influence care priorities according to the 'treat first what kills first' rule. Data from the German Trauma Registry suggest that the pan‐scan improves survival in both unselected trauma cohorts and haemodynamically unstable patients (Huber‐Wagner 2013). However, there are concerns regarding excess exposure to radiation caused by uncritical use of the pan‐scan at both the individual and population level (Asha 2012). While dose‐reducing reconstruction and processing algorithms are available, it is debatable whether they produce images that are similar in quality and diagnostic certainty to those produced by conventional protocols.

The pan‐scan is not only a competing imaging tool; it is also regarded as the diagnostic reference standard to which POCS findings must be compared. This leads to an interesting methodological conflict, as it will be almost impossible to compare both imaging modalities in a head‐to‐head fashion in the trauma scenario.

Rationale

In high‐income countries, it is doubtful whether POCS findings influence treatment decisions in severe trauma. This can be illustrated by the following four possible scenarios.

  1. POCS is positive for free abdominal or thoracic fluid, or both, in a haemodynamically stable patient. This will prompt a CT scan (usually a pan‐scan) to identify bleeding sources. In most cases, haemostatic transfusion (plus transarterial embolisation (TAE)) and intensive care unit (ICU) monitoring will be the treatment of choice in this setting.

  2. POCS is negative for free abdominal or thoracic fluid, or both, in a haemodynamically stable patient. This will prompt a CT scan (usually a pan‐scan) to verify that there are no active bleeding sources that were missed by ultrasound.

  3. POCS is negative for free abdominal or thoracic fluid, or both, in a haemodynamically unstable patient. This will almost always prompt a CT scan (usually a pan‐scan) to identify bleeding sources and to decide about TAE or emergency surgery, or both.

  4. POCS is positive for free abdominal or thoracic fluid, or both, in a haemodynamically unstable patient. Currently, haemostatic resuscitation and other critical care efforts will usually achieve enough stability to make patients pan‐scan ready.

Scenario 4 is relevant, but rare, in the Western world. There are very few occasions in which all resuscitation efforts fail and patients are scheduled for emergency thoracotomy or laparotomy, or both, based on POCS findings alone. Still, these situations occur, and clinical practice guidelines must include recommendations on how to cope with them.

In middle‐ and low‐income countries, however, POCS (in addition to conventional radiographs) may represent the most sophisticated or only non‐invasive diagnostic tool available to detect significant traumatic haemorrhage and guide triage. The Sichuan earthquake in 2008, which killed 69,197 people and left 18,222 missing, was a classic example. FAST proved to be effective, efficient, and possibly lifesaving under these exceptional circumstances (Zhou 2012). Similar observations were made after the earthquake in Haiti in 2010. The earthquake in Nepal in April 2015 (which killed more than 6000 and left 2.8 million people homeless) demonstrated how FAST can play a role in triaging patients effectively outside the context of clinical research.

Objectives

To determine the diagnostic accuracy of POCS for detection and exclusion of:

  1. free fluid in the thoracic or abdominal cavities;

  2. organ injuries with or without bleeding in the thoracic or abdominal cavities;

  3. vascular lesions of the thoracic or abdominal aorta, or other major vessels; and

  4. other injuries (e.g. pneumothorax);

compared to the following diagnostic reference standards: computed tomography (CT; 'pan‐scan'), magnetic resonance imaging (MRI), thoracotomy, laparotomy, laparoscopy, thoracoscopy, autopsy, or any combination of these.

Secondary objectives

The secondary objectives of this review were to investigate the influence of individual study and cohort characteristics such as the:

  1. reference standard;

  2. target condition;

  3. patient age (paediatric versus non‐paediatric);

  4. patient disease status: type of trauma, type of injury, haemodynamic stability, injury severity or probability of survival;

  5. environment;

  6. operator's expertise and background;

  7. hardware;

  8. test thresholds;

on both positive and negative POCS scans.

More details are provided in the Investigations of heterogeneity section of the review.

Methods

Criteria for considering studies for this review

Types of studies

We included:

  1. either prospective or retrospective diagnostic cohort studies that enrolled patients with blunt trauma who:

    1. underwent any type of POCS as primary imaging modality to screen for thoracoabdominal injuries; and

    2. also underwent predefined imaging or invasive reference tests to verify POCS results;

  2. studies that provided 2 x 2 tables (or sufficient information to tabulate results) to allow for calculating sensitivity, specificity, and other indices of diagnostic test accuracy.
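The 2 x 2 table required for inclusion can be sketched as a small data structure from which sensitivity and specificity follow directly. This is a minimal illustration only; the class name `TwoByTwo` and the counts in the example are hypothetical and are not taken from any included study.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Cross-classification of index test (POCS) against the reference standard."""
    tp: int  # POCS positive, injury confirmed by reference test
    fp: int  # POCS positive, no injury on reference test
    fn: int  # POCS negative, injury confirmed by reference test
    tn: int  # POCS negative, no injury on reference test

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)


# Hypothetical study with 200 participants, 50 of whom were injured
study = TwoByTwo(tp=37, fp=6, fn=13, tn=144)
print(study.sensitivity(), study.specificity())  # 0.74 0.96
```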

We excluded:

  1. diagnostic case‐control studies comparing patients with known case status to healthy controls, as this creates artificial populations and tends to overestimate sensitivity of the index test;

  2. case series and case reports;

  3. studies with unclear index or reference tests; and

  4. studies that did not allow for creating 2 x 2 tables.

Participants

The target population of this review comprised people of any age or gender who sustained any type of blunt trauma in a civilian scenario and were transferred to a hospital of any care level. To be eligible, participants also had to have undergone POCS as the primary imaging tool and to have been followed up, as inpatients or outpatients, with other diagnostic modalities to verify whether the condition of interest was present or absent.

Because of clear differences in clinical management, we deliberately excluded people with penetrating injuries, as well as members of armed forces wounded on the battlefield.

Index tests

Any type of POCS performed in a trauma setting (e.g. FAST ultrasonography of the abdomen or thorax, or both, or any advanced ultrasound protocol) intended to detect:

  1. free fluid (as a surrogate of bleeding) in the abdomen, retroperitoneal space, or chest;

  2. injuries to solid organs such as the liver or spleen (including attempts to grade their severity);

  3. lesions of major vessels; and

  4. other injuries (e.g. pneumothorax, as indicated by air in the pleural space).

Variation in POCS technology and application (e.g. specification of ultrasound machines and probes and how up‐to‐date they were, and handling of inconclusive test results) is addressed in the Assessment of methodological quality section of the review. We planned to examine its potential influence on diagnostic accuracy estimates in the Investigations of heterogeneity section of the review.

Target conditions

This review focused on blunt thoracoabdominal and multiple trauma, meaning any blunt, non‐penetrating force to the abdomen or chest that may injure solid or hollow viscera or major vessels. Target conditions considered by this review included:

  1. free fluid in the:

    1. thoracic cavity (uni‐ or bilateral, where specified);

    2. abdominal cavity (by abdominal quadrant, where specified);

    3. retroperitoneal space;

    4. pericardium; or

    5. mediastinum;

  2. organ injuries, defined as:

    1. liver injuries (e.g. capsular tears, haematoma, tissue lacerations);

    2. splenic injuries (e.g. capsular tears, haematoma, tissue lacerations);

    3. injuries to other solid organs (e.g. pancreas, kidneys);

    4. injuries to hollow viscera; or

    5. any other organ laceration detected by ultrasonography;

  3. vascular lesions, defined as:

    1. dissection or rupture of the thoracic or abdominal aorta, or both;

    2. rupture of other vessels such as the iliac arteries;

  4. other injuries (e.g. pneumothorax, as indicated by air in the pleural space in the thoracic cavity).

We analysed the effect of different types of target conditions as part of our Investigations of heterogeneity. We categorised target conditions into surrogates of blunt trauma (i.e. free fluid and free air, named limited assessment), and both surrogates and direct signs of organ damage (i.e. organ injuries and vascular lesions, named complete assessment).

Reference standards

In order to be accepted as a diagnostic reference standard, the deliberate use (and the reasoning for its use) of the particular method needed to be specified. To avoid verification bias, all participants were required to undergo an independent imaging or invasive test, regardless of the initial POCS scan.

We classified the following tests as reference standards to confirm the presence or absence of the target condition:

  1. any type of CT scan of the major body cavities (i.e. chest, abdomen, pelvis), either selective or performed as a whole‐body scan. We planned to stratify results for the use of intravenous or oral contrast agents, or both, and the time interval between POCS and CT;

  2. any type of MRI of the major body cavities;

  3. laparotomy (by a median or transverse approach), or laparoscopy, either diagnostic or therapeutic;

  4. thoracotomy (by median sternotomy or a clamshell approach), or thoracoscopy, either diagnostic or therapeutic;

  5. autopsy, either done by pathologists or forensic examiners.

Search methods for identification of studies

We developed a reproducible search strategy in major online databases based on recommendations of the Cochrane Diagnostic Test Accuracy (DTA) Group and a systematic review performed previously (Stengel 2005). We sought assistance and advice from the Cochrane Injuries Group and its Information Specialist to create a search algorithm with high sensitivity. We also requested access to the Cochrane Injuries Group Specialised Register and searched the Cochrane Library for relevant studies included in published reviews. Furthermore, we used a snowball procedure to identify related articles and articles cited in the reference lists of individual publications, and used Google Scholar as an additional search tool.

Electronic searches

We searched the following electronic sources.

  1. Ovid MEDLINE (1946 to 15 July 2017).

  2. PubMed (not MEDLINE) (1947 to 15 July 2017).

  3. Ovid Embase (1974 to 15 July 2017).

Search strategies are shown in Appendix 1.

We performed a further search on 6 December 2018; details of the eight potentially relevant studies identified have been added to the Characteristics of studies awaiting classification and Studies awaiting classification sections, and may be incorporated into the review at the next update.

Searching other resources

A systematic review by Scherer and colleagues showed that results from studies that have not been published in a full‐text format are systematically different from fully published results (Scherer 2007). We therefore searched the BIOSIS database for conference abstracts to identify potentially relevant studies that had not yet been published in a journal format (see Appendix 1).

We planned to contact authors of individual studies by email, letter, or phone, if we considered their results to be important but needed further explanation or raw data. We guaranteed that any data exchange complied with the International Conference on Harmonisation Good Clinical Practice (ICH‐GCP) principles and rules and regulations of data safety and security.

Data collection and analysis

We employed standard operating procedures (SOP) for the selection of studies, data extraction, and recording. This included the following principles:

  1. screening of titles, abstracts, and full texts of study reports identified by the search strategy by two review authors working independently;

  2. use of a data extraction form (including individual study characteristics, individual patient profiles, definition of procedures, etc.);

  3. dual assessment and data entry;

  4. dual assessment of methodological quality of individual studies;

  5. resolution of conflicts by a third review author.

These procedures were intended to ensure transparency and adherence to Cochrane standards and other recommendations (e.g. those issued by the EQUATOR (Enhancing the QUAlity and Transparency Of health Research) group).

Selection of studies

Two review authors (AH, JL) independently screened the titles and abstracts of the identified reports, documenting details of selected studies in a predefined electronic spreadsheet and assessing studies for eligibility in terms of the predefined inclusion and exclusion criteria. If it was not possible to make a decision based on title and abstract alone, the full texts of potentially relevant studies were assessed. Any disagreements between authors regarding the selection of studies were resolved by a third expert (DS). The study selection process is documented in a detailed flow chart (Figure 1).


Study flow diagram for the search conducted on 15 July 2017.

Data extraction and management

As stated above, we established an SOP for data extraction for systematic reviews, meta‐analyses, and health technology assessment (HTA) reports. We adhered to ICH‐GCP, Good Epidemiological Practice (GEP), and other relevant rules and recommendations. We have trained personnel on site to record, manage, and audit data, and our data storage modes comply with federal legislation on data safety for research purposes. Two review authors (AH, JL) independently extracted data from original papers in duplicate, and resolved discrepancies by discussion, moderated by a third review author (DS). They extracted the following information from published papers:

  1. study characteristics (author, year of study, year of publication, journal reference, study design, inclusion/exclusion criteria, operator characteristics, hardware specifications, index test used, reference test used, general setting (urban/rural), mass casualty (yes/no));

  2. patient characteristics (age, gender, type of trauma, type of injury, injury severity, haemodynamic stability, probability of survival);

  3. outcome of the index test as assessed in the individual studies by diagnosing the target condition and, if available, the number of participants with inconclusive results or who had no test result;

  4. diagnostic 2 x 2 tables, cross‐classifying the disease status on the basis of the reference test (i.e. number of true‐positive, false‐positive, false‐negative, and true‐negative results).

Diagnostic accuracy was expressed by individual and pooled indicators such as sensitivity and specificity with 95% confidence intervals (CI), positive and negative likelihood ratios (LR), positive predictive value (PPV), negative predictive value (NPV), and the summary receiver operating characteristic curve (SROC).
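As an illustration of how these indicators follow from a single 2 x 2 table, the sketch below computes them from the four cell counts. It is a minimal example rather than the software used in the review; the class name `TwoByTwo` and the counts are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TwoByTwo:
    """Cell counts from cross-classifying POCS against the reference standard."""
    tp: int  # POCS positive, reference positive
    fp: int  # POCS positive, reference negative
    fn: int  # POCS negative, reference positive
    tn: int  # POCS negative, reference negative

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def positive_lr(self) -> float:
        # Undefined (division by zero) when specificity is exactly 1.
        return self.sensitivity() / (1 - self.specificity())

    def negative_lr(self) -> float:
        return (1 - self.sensitivity()) / self.specificity()

    def ppv(self) -> float:
        return self.tp / (self.tp + self.fp)

    def npv(self) -> float:
        return self.tn / (self.tn + self.fn)


# Hypothetical study: 100 participants, 30 injured according to the reference test.
example = TwoByTwo(tp=24, fp=3, fn=6, tn=67)
print(f"sensitivity {example.sensitivity():.2f}, specificity {example.specificity():.2f}")
print(f"LR+ {example.positive_lr():.1f}, LR- {example.negative_lr():.2f}")
print(f"PPV {example.ppv():.2f}, NPV {example.npv():.2f}")
```

Unlike sensitivity and specificity, the PPV and NPV computed this way depend on the prevalence of injury in the study sample.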

Assessment of methodological quality

Two review authors (AH, JL) independently used the QUADAS‐2 tool (the revised and updated version of the original Quality Assessment of Diagnostic Accuracy Studies list of items) to assess methodological quality of individual studies (Whiting 2011). Any discrepancies were resolved by discussion, moderated by a third review author (DS). QUADAS‐2 includes four main domains, namely: patient selection, index test, reference standard, and flow and timing. We assessed each domain with regard to risk of bias, rating them as 'low', 'high', or 'unclear'. We assessed concerns regarding applicability only for the first three domains, categorising them as 'low', 'high', or 'unclear'. Signalling questions were answered as 'yes', 'no', or 'unclear' (see also Appendix 2). By using tailored review‐specific signalling questions we were able to perform a custom‐made assessment of the methodological quality of all included studies. We omitted the signalling question "Was any case‐control design avoided?" in Domain 1: Patient selection since we did not include any case‐control studies, case series, or case reports. We added three signalling questions to Domain 2: Index test and two signalling questions to Domain 3: Reference standard, referring to operator's expertise and background, technical features of the hardware, and appropriateness of the ultrasound protocol and reference imaging standard. In Domain 4: Flow and timing, we included the signalling question "Did all participants receive a reference standard?" in order to explore the risk of partial verification bias.

Statistical analysis and data synthesis

If at least one of the target conditions was detected (i.e. pneumothorax, free fluid, organ or vessel injury), we considered the patient (participant) to be traumatised or test‐positive. Otherwise, we considered the participant to be uninjured or test‐negative. Our observational unit of interest was thus the individual participant, not a particular injury, and we did not evaluate single target conditions separately in the primary analysis. We used inconclusive test results as reported in the primary studies. For individual studies, we calculated sensitivity and specificity with 95% CI, tabulated pairs of sensitivity and specificity with CI, and depicted estimates by coupled forest plots using Review Manager 5 (RevMan 5) (Review Manager 2014).
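The participant‐level rule described above can be sketched as follows. This is only an illustration under stated assumptions: the field names, the helper functions, and the four example participants are hypothetical and do not come from the included studies.

```python
from collections import Counter

# Hypothetical labels for the target conditions named in the text.
TARGET_CONDITIONS = ("free_fluid", "organ_injury", "vascular_lesion", "pneumothorax")


def is_positive(findings: dict) -> bool:
    """Positive if at least one target condition was detected."""
    return any(findings.get(condition, False) for condition in TARGET_CONDITIONS)


def cross_classify(participants) -> Counter:
    """Count TP, FP, FN, TN with the participant as the unit of analysis."""
    counts = Counter(TP=0, FP=0, FN=0, TN=0)
    for pocs_findings, reference_findings in participants:
        index_pos = is_positive(pocs_findings)
        reference_pos = is_positive(reference_findings)
        if index_pos and reference_pos:
            counts["TP"] += 1
        elif index_pos:
            counts["FP"] += 1
        elif reference_pos:
            counts["FN"] += 1
        else:
            counts["TN"] += 1
    return counts


# Hypothetical participants: (POCS findings, reference standard findings).
participants = [
    ({"free_fluid": True}, {"organ_injury": True}),  # TP although the findings differ
    ({}, {"pneumothorax": True}),                    # FN
    ({"free_fluid": True}, {}),                      # FP
    ({}, {}),                                        # TN
]
print(cross_classify(participants))
```

The first participant shows the consequence of this choice: a scan counts as true positive whenever both tests flag any injury, even if they flag different target conditions.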

Due to the subjective nature of the interpretation of POCS findings, we expected implicit thresholds in test positivity. We assessed a possible threshold effect visually by plotting true‐positive rates (sensitivity) from each study against false‐positive rates (1 − specificity) in a receiver operating characteristic (ROC) space and coupled forest plots of sensitivity and specificity. Since the dichotomous operationalisation of the test result does not enable explicit thresholds, we used the bivariate model according to Reitsma 2005, which is a robust statistical model taking the underlying relationship between sensitivity and specificity into account. The random‐effects approach allows for calculating sensitivity and specificity estimates while controlling for heterogeneity across studies. We fitted the models in Stata (Stata 2017) using the 'metandi' command and produced SROC plots using RevMan 5. We estimated average sensitivities and specificities using the bivariate model. We obtained likelihood ratios post estimation using the parameters of the bivariate model (see summary of findings Table 1).
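For orientation, the bivariate random‐effects model can be written as below. This is the commonly cited form with a binomial within‐study likelihood, which is assumed here to correspond to what 'metandi' fits; the notation (study index i, means \mu_A and \mu_B, and the covariance terms) is introduced for illustration and is not taken from the review.

```latex
\begin{aligned}
\mathrm{TP}_i &\sim \mathrm{Binomial}\left(\mathrm{TP}_i + \mathrm{FN}_i,\ \mathrm{Se}_i\right), \\
\mathrm{TN}_i &\sim \mathrm{Binomial}\left(\mathrm{TN}_i + \mathrm{FP}_i,\ \mathrm{Sp}_i\right), \\
\begin{pmatrix} \operatorname{logit}\mathrm{Se}_i \\ \operatorname{logit}\mathrm{Sp}_i \end{pmatrix}
&\sim \mathcal{N}\left(
\begin{pmatrix} \mu_A \\ \mu_B \end{pmatrix},
\begin{pmatrix} \sigma_A^2 & \sigma_{AB} \\ \sigma_{AB} & \sigma_B^2 \end{pmatrix}
\right).
\end{aligned}
```

Under this formulation, the summary sensitivity and specificity are the inverse logits of \mu_A and \mu_B, and \sigma_{AB} captures the correlation between sensitivity and specificity across studies.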

Investigations of heterogeneity

We assessed heterogeneity visually by inspecting the coupled forest plots and plots of study results in the SROC space. We also investigated possible sources of heterogeneity by adding single covariates to the basic bivariate random‐effects model, which we fitted using the 'xtmelogit' command in Stata (Takwoingi 2016). We assessed the effect of each covariate with a likelihood ratio test that compared the ‐2 log likelihood of the basic bivariate model with that of a model including the covariate. If the reduction in the ‐2 log likelihood was statistically significant (P value < 0.05), we considered test performance to be associated with the covariate. For statistically significant results, we determined whether the covariate was associated with the estimated sensitivity, specificity, or both (Macaskill 2011), by removing the covariate terms for either sensitivity or specificity, and comparing the fit of each alternative model using likelihood ratio tests.
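The likelihood ratio test described above can be sketched in a few lines. The example assumes that a binary covariate adds two parameters to the model (one each for logit sensitivity and logit specificity), so the overall test has 2 degrees of freedom and each pair‐wise test has 1; the ‐2 log likelihood values and function names are hypothetical.

```python
import math


def chi2_upper_tail(statistic: float, df: int) -> float:
    """Upper-tail probability of a chi-squared distribution (closed forms for df 1 and 2)."""
    if df == 1:
        return math.erfc(math.sqrt(statistic / 2))
    if df == 2:
        return math.exp(-statistic / 2)
    raise ValueError("closed form implemented for df = 1 or 2 only")


def likelihood_ratio_test(neg2ll_basic: float, neg2ll_covariate: float, df: int):
    """Compare the -2 log likelihoods of two nested models."""
    statistic = neg2ll_basic - neg2ll_covariate
    return statistic, chi2_upper_tail(statistic, df)


# Hypothetical -2 log likelihoods for the basic model and a model with one covariate.
statistic, p_value = likelihood_ratio_test(412.6, 403.5, df=2)
print(f"Chi2 = {statistic:.2f}, P value = {p_value:.4f}")
print("associated with covariate" if p_value < 0.05 else "no evidence of association")
```

For degrees of freedom other than 1 or 2, a statistics library would be needed in place of the closed forms used here.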

For tests of heterogeneity, we required a minimum of 10 studies in total and at least four studies per subgroup. We had to dichotomise the covariates we investigated, and differentiate between paediatric and non‐paediatric (i.e. adult/mixed populations), surrogates of injury (e.g. air, free fluid) and organ lacerations, abdominal injuries (i.e. injuries exclusively located in the abdomen) or chest injuries (i.e. injuries exclusively located in the chest), and single CT versus CT plus laparotomy used as reference standard (see summary of findings Table 2).

Sensitivity analyses

We performed sensitivity analyses to investigate how individual QUADAS‐2 key domains (i.e. patient selection, index test, reference standard, and flow and timing) affected accuracy estimates, and to explore whether the different evaluations of the two independent review authors within two original studies influenced pooled sensitivities or specificities, or both, of the index test. Moreover, we examined the impact of participants' age on accuracy estimates by only including paediatric studies.

Assessment of reporting bias

We did not assess reporting bias because there are no accepted ways of doing this for diagnostic test accuracy studies (Deeks 2005).

Results

Results of the search

We conducted the electronic search on 15 July 2017 in Ovid MEDLINE, PubMed, and Ovid Embase, applying the strategy shown in Appendix 1. We identified 2872 publications including 576 duplicates (Figure 1). After screening the titles or abstracts of 2296 records, 91 studies remained for further evaluation according to our predefined inclusion and exclusion criteria (Types of studies). After screening the full texts of these 91 studies, we discarded 57 and included 34. We regarded the published data as sufficient to answer our research questions and so did not require individual author contact.

Included studies

We extracted information from the 34 included studies according to predefined criteria (Characteristics of included studies). The included studies compared POCS to various imaging and surgical standards (i.e. CT, conventional radiography, laparotomy, thoracotomy, and autopsy) and were published between 1992 (Tso 1992) and 2017 (Calder 2017), with sample sizes ranging from 51 participants (Benya 2000) to 3181 (Becker 2010). Retrospective and prospective designs were equally distributed, and half of all investigations were conducted in the USA. Ten studies enrolled only children and adolescents, with the age of participants ranging from 1 to 18 years (Benya 2000; Calder 2017; Coley 2000; Corbett 2000; Emery 2001; Fox 2011; Menichini 2015; Soudack 2004; Valentino 2010; Zhou 2012). Two studies included only adults (Blaivas 2005; Verbeek 2014), and 22 studies enrolled participants of any age. Four studies addressed thoracic trauma exclusively (Blaivas 2005; Nandipati 2011; Ojaghi 2014; Zhang 2006). Half of all participants were admitted to level I trauma centres.

Excluded studies

Fifty‐seven studies did not meet the inclusion criteria and were excluded (Characteristics of excluded studies). The main reasons for exclusion were insufficient information to allow for calculating diagnostic accuracy (n = 20), missing specification of or improper reference standards (n = 16, i.e. follow‐up ultrasound examination, diagnostic peritoneal lavage (DPL), or clinical observation), or penetrating injuries (n = 13). Reasons for exclusions are summarised in Figure 1.

Methodological quality of included studies

We evaluated the methodological quality of individual studies using the QUADAS‐2 tool and summarised quality assessments per fulfilled QUADAS‐2 domain (Figure 2; Figure 3). Poor reporting, especially in the patient selection and reference standard domains, hampered conclusive judgements about the risk of bias. Only five studies had a low risk of bias in all four critical domains (Benya 2000; Coley 2000; Emery 2001; Soudack 2004; Verbeek 2014). We rated at least two risk‐of‐bias domains as unclear or high in 19 studies.


Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study


Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies

Patient selection

Three studies had a high risk of bias with regard to patient selection due to non‐consecutive enrolment of participants (e.g. inconsistent availability of designated sonographers, refusal of informed consent) and inappropriate exclusions (e.g. exclusion of patients with underlying diseases associated with intra‐abdominal fluid) (Ojaghi 2014; Talari 2015; Zhou 2012). Eleven studies had an unclear risk of bias due to non‐consecutive enrolment of patients (Becker 2010; Blaivas 2005; Cheung 2012; Clevert 2008; Corbett 2000; Fox 2011; Hsu 2007; Kendall 2009; Smith 2010; Valentino 2010; Wong 2014). We rated the patient inclusion procedure as unclear for 10 studies (Calder 2017; Catalano 2009; Dolich 2001; Friese 2007; Iqbal 2014; Kärk 2012; McKenney 1994; Nandipati 2011; Todd Miller 2003; Zhang 2006).

Index test

Unclear risk of bias ratings in both index test and reference standard domains originated mainly from missing or unclear information about hardware standards (e.g. machine specifications missing, no information on the number of imaging planes, etc.) or the qualification of operators, or both. Reported qualifications of POCS examiners ranged from attendance of an eight‐hour ultrasound course in Hsu 2007 to 10 years of experience in Menichini 2015. We rated eight studies as at unclear risk of bias due to a lack of information regarding the skills of sonographers and insufficient specification about whether index test results were interpreted without knowledge of other imaging test results (Calder 2017; Hsu 2007; Kärk 2012; Kendall 2009; McElveen 1997; McKenney 1994; Wong 2014; Zhou 2012).

Reference standard

We rated 12 studies as having an unclear risk of bias due to missing technical specifications for the reference imaging test (Calder 2017; Cheung 2012; Dolich 2001; Iqbal 2014; Kärk 2012; Kumar 2015; McElveen 1997; Nandipati 2011; Smith 2010; Tso 1992; Wong 2014; Zhou 2012). Information concerning the diagnostic reference standard was generally far scarcer than details about the index test.

Flow and timing

Most studies had a low risk of bias with regard to the examination flow and timing domain. Two studies employed diagnostic reference standards conditional on the result of ultrasound exams in some (McElveen 1997) or all (Menichini 2015) of the examined participants.

Findings

Diagnostic performance of individual studies comparing POCS with reference standard

Coupled forest plots of individual studies' sensitivities and specificities along with true positives (TP), false positives (FP), false negatives (FN), and true negatives (TN) are depicted in Figure 4. The sensitivity of POCS ranged from 0.26 (95% CI 0.14 to 0.42) to 1.00 (95% CI 0.74 to 1.00). Specificity ranged from 0.59 (95% CI 0.44 to 0.73) to 1.00 (95% CI 0.97 to 1.00). Visual inspection of the coupled forest plots of individual studies' sensitivities and specificities did not indicate any threshold effect; we therefore considered the bivariate model to be the appropriate pooling procedure.


Coupled forest plots of sensitivity and specificity. TP = true positive; FP = false positive; FN = false negative; TN = true negative

Estimates derived from the bivariate model comparing POCS with reference standard

Figure 5 shows the pooled summary point for sensitivity and specificity derived from the bivariate model with corresponding 95% confidence and prediction regions. Summary estimates of sensitivity and specificity were 0.74 (95% CI 0.65 to 0.81) and 0.96 (95% CI 0.94 to 0.98). Corresponding positive and negative LRs were 18.5 (95% CI 10.8 to 40.5) and 0.27 (95% CI 0.19 to 0.37); PPV was 0.88 (95% CI 0.81 to 0.94); and NPV was 0.90 (95% CI 0.87 to 0.93). The observed median prevalence of blunt thoracoabdominal trauma in the total cohort was 28%. In a virtual population of 1000 patients, assuming the median prevalence of 28%, POCS would miss 73 patients with injuries, and falsely suggest the presence of injuries in another 29 patients.
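The virtual population figures follow from simple arithmetic on the summary estimates and the median prevalence. The sketch below shows the calculation; the function name `natural_frequencies` and its return keys are hypothetical, and deriving PPV and NPV from the summary point in this way is an assumption about how such values can be obtained, not a description of the review's procedure.

```python
def natural_frequencies(sensitivity: float, specificity: float,
                        prevalence: float, cohort: int = 1000) -> dict:
    """Expected test outcomes in a virtual cohort of a given size."""
    injured = cohort * prevalence
    uninjured = cohort - injured
    true_pos = injured * sensitivity
    false_neg = injured - true_pos       # injured patients missed by POCS
    true_neg = uninjured * specificity
    false_pos = uninjured - true_neg     # uninjured patients flagged by POCS
    return {
        "missed": round(false_neg),
        "false_positives": round(false_pos),
        "ppv": true_pos / (true_pos + false_pos),
        "npv": true_neg / (true_neg + false_neg),
        "positive_lr": sensitivity / (1 - specificity),
        "negative_lr": (1 - sensitivity) / specificity,
    }


# Summary estimates and median prevalence as reported in the text.
summary = natural_frequencies(sensitivity=0.74, specificity=0.96, prevalence=0.28)
print(summary["missed"], summary["false_positives"])
print(round(summary["ppv"], 2), round(summary["npv"], 2))
print(round(summary["positive_lr"], 1), round(summary["negative_lr"], 2))
```

With these inputs the sketch yields 73 missed patients and 29 false positives per 1000, and its point estimates for the predictive values and likelihood ratios agree with those reported above to the stated precision.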


Summary receiver operating characteristic (ROC) plot of sensitivity and specificity of all 34 included studies. The solid circle represents the summary estimate of sensitivity and specificity. The summary estimate is surrounded by a dotted line representing the 95% confidence region and a dashed line representing the 95% prediction region.

Heterogeneity

The prediction region around the summary estimate in Figure 5 indicates with 95% confidence where the true sensitivity and specificity of POCS would be expected in a future study. As indicated by the width of the region, there was considerable heterogeneity between studies. Regarding sensitivity, the 95% prediction region varied from 0.14 to 0.98, while the specificity of future studies was estimated to range from 0.42 to 1.00. This marked between‐study heterogeneity needs further exploration.

a. Effect of reference standard

Each individual study used CT as confirmative imaging modality, either as a single gold standard or in combination with other reference tests. In 25 studies, target conditions were confirmed exclusively with CT, and in seven studies with CT and laparotomy. There was no difference in POCS sensitivity and specificity when compared with CT or CT plus laparotomy (Chi2 = 0.18; P value (overall effect) = 0.9160; CT: 0.75 (95% CI 0.63 to 0.84) and 0.97 (95% CI 0.93 to 0.98), CT plus laparotomy: 0.73 (95% CI 0.58 to 0.84) and 0.95 (95% CI 0.87 to 0.98)).

b. Effect of target condition

Twenty‐two studies assessed the diagnostic accuracy of POCS targeting surrogate measures like free fluid (18 studies) or air (four studies). Three studies aimed to assess solid organ damage, and another seven studies targeted both free fluid and direct signs of organ injuries. The individual target condition mainly affected specificity estimates (Chi2 = 9.10; P value (overall effect) = 0.0106). Sensitivity of POCS limited to detecting free fluid/air was 0.78 (95% CI 0.68 to 0.85), compared to 0.80 (95% CI 0.73 to 0.85) for complete assessment (Chi2 = 0.06; P value (pair‐wise) = 0.8100). Related specificities were 0.97 (95% CI 0.96 to 0.99) and 0.88 (95% CI 0.70 to 0.96), respectively (Chi2 = 8.08; P value (pair‐wise) = 0.0045). Coupled forest plots for limited assessment (Figure 6) and complete assessment (Figure 7) show greater variation in specificity in studies targeting both free fluid and direct signs of organ injuries compared to studies aimed only at free fluid or free air.


Coupled forest plots of sensitivity and specificity for studies targeting only free fluid or free air (n = 22). TP = true positive; FP = false positive; FN = false negative; TN = true negative


Coupled forest plots of sensitivity and specificity for studies considering both surrogates and organ lacerations (n = 7). TP = true positive; FP = false positive; FN = false negative; TN = true negative

c. Effect of participant age

Ten studies included only children under 18 years of age, whereas 24 studies involved adults or a largely adult population. Participant age was associated with significantly different estimates of both sensitivity and specificity (Chi2 = 7.32; P value (overall effect) = 0.0258). Pooled sensitivity of POCS was 0.63 (95% CI 0.46 to 0.77) in children and 0.78 (95% CI 0.69 to 0.84) in an adult or mixed population (Chi2 = 54.91; P value (pair‐wise) < 0.0001). Associated specificities were 0.91 (95% CI 0.81 to 0.96) and 0.97 (95% CI 0.96 to 0.99) (Chi2 = 19.88; P value (pair‐wise) < 0.0001). Figure 8 depicts individual sensitivity and specificity estimates along with summary points, 95% confidence regions, and 95% prediction regions for both paediatric and non‐paediatric studies. Studies including only children are shown as black dots, while studies with a predominantly adult population are shown as red dots. Figure 4 illustrates sensitivities and specificities from individual studies for non‐paediatric (first 24 studies) and paediatric populations (last 10 studies).


Summary receiver operating characteristic (ROC) plot of sensitivity and specificity: paediatric studies (n = 10; indicated in black) versus non‐paediatric studies (n = 24; indicated in red). The solid circles represent the summary estimates of sensitivity and specificity. The summary estimates are surrounded by a dotted line representing the 95% confidence region and a dashed line representing the 95% prediction region.

d. Effect of type of injury

Twenty‐seven studies targeted abdominal injuries; four studies addressed thoracic trauma; two studies addressed both; and one study examined blunt truncal trauma without further specification. We tested the influence of the anatomic region on the basic model by adding a binary covariate that dichotomised the injury type into thoracic or abdominal trauma. Based on 31 studies, this led to significantly different estimates of sensitivity and specificity (Chi2 = 17.36; P value (overall effect) = 0.0002). Sensitivity of POCS for abdominal and thoracic trauma was 0.68 (95% CI 0.59 to 0.75) and 0.96 (95% CI 0.88 to 0.99), respectively (Chi2 = 13.22; P value (pair‐wise) = 0.0003). Specificity was 0.95 (95% CI 0.92 to 0.97) and 0.99 (95% CI 0.97 to 1.00), respectively (Chi2 = 5.39; P value (pair‐wise) = 0.0202). Individual sensitivities and specificities for abdominal trauma, thoracic trauma, and trauma that is not exclusively abdominal or thoracic (i.e. both abdominal and thoracic trauma, truncal trauma) are displayed in Figure 4. Individual and average estimates of sensitivity and specificity, and both 95% confidence regions and 95% prediction regions around the summary estimates for thoracic and abdominal studies separately are illustrated in Figure 9. For abdominal trauma, the accuracy values are widely scattered across studies, whereas sensitivity and specificity values are consistently high when targeting only thoracic trauma.


Summary receiver operating characteristic (ROC) plot of sensitivity and specificity: abdominal studies (n = 27; indicated in black) versus thoracic studies (n = 4; indicated in red). The solid circles represent the summary estimates of sensitivity and specificity. The summary estimates are surrounded by a dotted line representing the 95% confidence region and a dashed line representing the 95% prediction region.

Sensitivity analysis

a. Effect of study quality

We performed sensitivity analyses to investigate the effect of study quality on diagnostic accuracy estimates separately for each of the QUADAS‐2 key domains. Study quality did not have a substantial effect on either sensitivity or specificity estimates in any of the four domains. Sensitivity and specificity estimates for studies with low risk of bias were as follows: patient selection (9 studies) 0.75 (95% CI 0.63 to 0.85) and 0.93 (95% CI 0.83 to 0.98), index test (22 studies) 0.77 (95% CI 0.66 to 0.85) and 0.97 (95% CI 0.94 to 0.98), reference standard (11 studies) 0.77 (95% CI 0.60 to 0.89) and 0.97 (95% CI 0.93 to 0.99), flow and timing (30 studies) 0.74 (95% CI 0.65 to 0.81) and 0.96 (95% CI 0.93 to 0.98).

b. Effect of two independent reviewers within original studies

In two original studies (Benya 2000; Ghafouri 2016), sonograms and CT examinations were evaluated independently by two reviewers, yielding a separate set of accuracy values for each reviewer. To analyse the influence of the reviewers' decisions on pooled accuracy estimates, we assigned the lower of the two reviewers' accuracy estimates from each study to one set (Main analysis set) and the higher estimates to another (Sensitivity analysis set), and compared the results; we detected no difference in either diagnostic accuracy estimate. The summary estimates of sensitivity and specificity along with CIs were identical in both sets, with sensitivity estimates of 0.74 (95% CI 0.65 to 0.81) and specificity estimates of 0.96 (95% CI 0.94 to 0.98). The positive and negative LRs differed marginally, with 20.9 (95% CI 12.0 to 36.5) and 0.27 (95% CI 0.20 to 0.37) in the Main analysis set compared with 20.8 (95% CI 11.8 to 36.8) and 0.27 (95% CI 0.20 to 0.37) in the Sensitivity analysis set.

c. Effect of patient age

When including only studies with patients under 18 years of age, sensitivity and specificity values were lower than in the analysis that included adults. In the 10 paediatric studies with 1384 participants, sensitivity was estimated at 0.62 (95% CI 0.47 to 0.75) and specificity at 0.91 (95% CI 0.81 to 0.96) (see summary of findings Table 1). Positive LR with 6.9 (95% CI 2.5 to 18.8), PPV with 0.76 (95% CI 0.53 to 0.89), and NPV with 0.84 (95% CI 0.77 to 0.90) were lower, and negative LR was higher with 0.42 (95% CI 0.26 to 0.65) than in the complete set of studies. In a virtual cohort of 1000 children having sustained blunt trauma, thoracoabdominal injuries would be missed in 118 cases (compared to 73 in the overall cohort), and 62 children would be falsely diagnosed as having sustained injuries (compared to 29 in the overall cohort).
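The relationship between the likelihood ratios and the predictive values reported for the paediatric studies can be illustrated with the odds form of Bayes' theorem. The following is a minimal sketch, not the review's statistical model: the function name `post_test_probability` is hypothetical, and the pre-test probability is assumed to equal the median prevalence of 31% reported for the children-only cohort.

```python
def post_test_probability(pre_test_prob: float, likelihood_ratio: float) -> float:
    """Update a pre-test probability with a likelihood ratio via odds."""
    pre_odds = pre_test_prob / (1.0 - pre_test_prob)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1.0 + post_odds)


# Values reported in the review for the children-only cohort.
prevalence = 0.31   # median prevalence (assumed here as pre-test probability)
lr_positive = 6.9   # positive likelihood ratio
lr_negative = 0.42  # negative likelihood ratio

# Probability of injury after a positive scan (corresponds to the PPV).
p_injury_if_positive = post_test_probability(prevalence, lr_positive)
# Probability of injury after a negative scan (1 minus this is the NPV).
p_injury_if_negative = post_test_probability(prevalence, lr_negative)

print(f"PPV ~ {p_injury_if_positive:.2f}")
print(f"NPV ~ {1.0 - p_injury_if_negative:.2f}")
```

Under these assumptions the point estimates agree with the reported PPV of 0.76 and NPV of 0.84; the confidence intervals cannot be reproduced this way.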

Discussion

Summary of main results

(See summary of findings Table 1)

In this systematic review, we included 34 studies with 8635 participants that evaluated the diagnostic accuracy of point‐of‐care sonography (POCS) for diagnosing thoracoabdominal injuries in patients with blunt trauma. Summary estimates of sensitivity and specificity were 0.74 (95% confidence interval (CI) 0.65 to 0.81) and 0.96 (95% CI 0.94 to 0.98). Corresponding positive and negative likelihood ratios were 18.5 (95% CI 10.8 to 40.5) and 0.27 (95% CI 0.19 to 0.37), respectively. There was no threshold effect. We judged risk of bias as largely unclear due to insufficient information, especially in terms of patient selection and reference standard domains.
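As a rough cross-check, the point estimates of the likelihood ratios can be derived from the summary sensitivity and specificity using the standard definitions. This is only a sketch: the function name `likelihood_ratios` is hypothetical, and the confidence intervals reported in the review come from the meta-analytic model and cannot be obtained from the point estimates alone.

```python
def likelihood_ratios(sensitivity: float, specificity: float) -> tuple[float, float]:
    """Return (LR+, LR-) from sensitivity and specificity."""
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Summary estimates from the main analysis of the review.
lr_positive, lr_negative = likelihood_ratios(sensitivity=0.74, specificity=0.96)

print(f"LR+ ~ {lr_positive:.1f}")
print(f"LR- ~ {lr_negative:.2f}")
```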

There was significant heterogeneity in both sensitivity and specificity across studies, which was partly explained by patient age, type of injury, and target condition. In children, the pooled sensitivity of POCS was 0.63 (95% CI 0.46 to 0.77), compared to 0.78 (95% CI 0.69 to 0.84) in an adult or mixed population. The associated specificities were 0.91 (95% CI 0.81 to 0.96) and 0.97 (95% CI 0.96 to 0.99), respectively. Pair‐wise comparisons for both sensitivity and specificity yielded P values less than 0.0001. Taking into account the rather large number of studies in both groups (i.e. 10 paediatric studies versus 24 non‐paediatric studies), this may indicate a real difference between the two groups that cannot be explained by chance or other characteristics. For abdominal trauma, POCS had a sensitivity of 0.68 (95% CI 0.59 to 0.75) and a specificity of 0.95 (95% CI 0.92 to 0.97). For chest injuries, sensitivity and specificity were 0.96 (95% CI 0.88 to 0.99) and 0.99 (95% CI 0.97 to 1.00), respectively. However, only four studies targeted chest injuries exclusively, none of which enrolled children, for whom accuracy estimates appear to be lower generally. The individual target condition mainly affected specificity estimates, with a specificity of 0.97 (95% CI 0.96 to 0.99) for evaluations limited to free fluid/air and a specificity of 0.88 (95% CI 0.70 to 0.96) for complete assessments that also included direct signs of organ damage.
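The link between the Chi2 statistics and P values listed in summary of findings Table 2 can be sketched as follows. The review does not state the degrees of freedom; this example assumes 2 degrees of freedom for the overall test (a covariate effect on both sensitivity and specificity) and 1 degree of freedom for the separate tests, which is an assumption on our part. The helper `chi2_p_value` is hypothetical and uses closed forms that hold only for 1 or 2 degrees of freedom.

```python
import math


def chi2_p_value(statistic: float, df: int) -> float:
    """Upper-tail P value of a chi-squared statistic (df = 1 or 2 only)."""
    if df == 1:
        # For df = 1 the survival function is erfc(sqrt(x / 2)).
        return math.erfc(math.sqrt(statistic / 2.0))
    if df == 2:
        # For df = 2 the survival function is exp(-x / 2).
        return math.exp(-statistic / 2.0)
    raise ValueError("this sketch only covers df = 1 or df = 2")


# Chi2 values taken from summary of findings Table 2.
p_target_overall = chi2_p_value(9.10, df=2)      # target condition, overall
p_target_specificity = chi2_p_value(8.08, df=1)  # target condition, specificity
p_age_sensitivity = chi2_p_value(54.91, df=1)    # participant age, sensitivity

print(f"{p_target_overall:.4f}")
print(f"{p_target_specificity:.4f}")
print(f"{p_age_sensitivity:.4f}")
```

Under the assumed degrees of freedom, the first two values are close to the tabulated 0.0106 and 0.0045, and the third is far below 0.0001, which is why it appears as 0.0000 when rounded to four decimals.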

Summary estimates of sensitivity and specificity remained similar in studies at low risk of bias across all four domains (ranging from 0.74 to 0.77 and 0.93 to 0.97, respectively). When only children were included, summary estimates were lower compared to the main analysis, with a sensitivity of 0.62 (95% CI 0.47 to 0.75) and a specificity of 0.91 (95% CI 0.81 to 0.96).

In a virtual cohort of 1000 patients, assuming the observed median prevalence of thoracoabdominal trauma of 28%, POCS would miss 73 patients with injuries, and falsely suggest the presence of injuries in another 29 patients. In a children‐only cohort, POCS would miss 118 patients with injuries, and falsely suggest the presence of injuries in another 62 patients.
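The virtual cohort figures follow from simple arithmetic on the summary estimates and the median prevalence. The sketch below reproduces them; the function name `virtual_cohort` and its dictionary keys are hypothetical, and counts are rounded to whole patients.

```python
def virtual_cohort(sensitivity: float, specificity: float,
                   prevalence: float, cohort_size: int = 1000) -> dict:
    """Expected test outcomes in a hypothetical cohort of trauma patients."""
    injured = round(cohort_size * prevalence)
    uninjured = cohort_size - injured
    missed = round(injured * (1.0 - sensitivity))             # false negatives
    false_positive = round(uninjured * (1.0 - specificity))   # false positives
    return {
        "injured": injured,
        "detected": injured - missed,
        "missed": missed,
        "false_positive": false_positive,
    }


# Overall cohort: median prevalence 28%.
overall = virtual_cohort(sensitivity=0.74, specificity=0.96, prevalence=0.28)
# Children-only cohort: median prevalence 31%.
children = virtual_cohort(sensitivity=0.62, specificity=0.91, prevalence=0.31)

print(overall)
print(children)
```

With these inputs, the sketch yields 73 missed and 29 false-positive patients overall, and 118 missed and 62 false-positive patients in the children-only cohort, matching the figures above.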

Strengths and weaknesses of the review

We performed a comprehensive literature search in major electronic databases using a reproducible retrieval strategy. With 2296 screened records and 34 eligible studies, we are confident the data set constitutes, at minimum, a representative sample and, at best, a complete set of studies investigating the diagnostic accuracy of POCS in patients with blunt trauma. We included diagnostic test accuracy (DTA) filters in our Ovid MEDLINE and Ovid Embase search strategies as suggested by the Information Specialist of the Cochrane Injuries Group and in adherence with the protocol. Methodological search filters are generally used to cut down large numbers of primary studies and to focus the search on the most relevant citations (Lefebvre 2017). Current recommendations do not support the use of methodological filters, as they may impair sensitivity and precision (Beynon 2013). Although we are not aware of any major accuracy study missed by our search algorithm, the use of a methodological search filter might represent a potential limitation in this review.

The relatively large number of investigations allowed for pooling summary estimates of sensitivity and specificity, and for exploring potential sources of heterogeneity. Using tailored review‐specific signalling questions in the QUADAS‐2 tool allowed us to perform a custom‐made assessment of the methodological quality of all included studies. Unfortunately, this assessment was impeded by considerable under‐reporting in the original studies. We investigated the influence of poor study quality by conducting sensitivity analyses separately for each risk‐of‐bias domain, which showed that study quality did not markedly influence diagnostic accuracy estimates.

There was substantial heterogeneity between studies, visible in the rather wide 95% prediction region in Figure 5. As there are fewer patients with than without the target condition, the 95% prediction region for sensitivity (0.14 to 0.98) was larger than that for specificity (0.42 to 1.00). However, we were able to explain heterogeneity partly by study characteristics such as participants' age, target condition, and type of injury. The performance of POCS in children remains controversial (Holmes 2007). The lower specificity observed in children in this review may be explained by the disproportionately larger number of complete assessments of abdominal injuries in paediatric compared to adult studies.

The higher sensitivity and specificity of POCS in studies examining only the thorax in comparison with studies focusing on the abdomen is in agreement with previous reviews (Alrajab 2013; Alrajhi 2012; Ding 2011; Ebrahimi 2014). In our review, an evaluation of both free fluid and organ injuries by POCS resulted in lower specificity than an evaluation limited to free fluid or free air, which is in agreement with published results (e.g. Poletti 2003).

Due to missing information, we were unable to explore some characteristics (i.e. environment, operators' expertise and background, hardware, test thresholds) as sources of heterogeneity. Only two studies described the handling of inconclusive results: in Iqbal 2014, inconclusive results were handled as positive test results, and in Dolich 2001, indeterminate results were excluded from sensitivity and specificity calculations, and the percentage of indeterminate results was only 1%. We do not expect the generally low number of inconclusive test results in the primary studies to affect our results. We had to modify our original categorisation of reference standards and participant age to investigate heterogeneity. Since computed tomography (CT) was used as a reference test in every single study, we compared the diagnostic accuracy of POCS to CT and CT plus laparotomy. Given the small number of studies including adults exclusively, we decided to compare children‐only cohorts against studies including adults or a mixed‐age population. We preferred this approach over splitting data into two parts based on participants' median or mean age.

We classified participants as test positive if any one of the target conditions was detected, irrespective of the fact that target conditions could differ between index test and reference standard. The majority of original studies (i.e. 25 of 34) used similar target conditions for index test and reference standard, and thus did not cause a mismatch. However, non‐transparent reporting in the remaining studies prevented us from correlating the source of bleeding between both diagnostic procedures, and may have resulted in a mismatch regarding the target condition. We do not presume a substantial mismatch here; however, an effect on test accuracy estimates cannot be excluded.

We restricted suitable reference standards to predefined imaging or invasive tests (i.e. CT, magnetic resonance imaging (MRI), laparotomy, laparoscopy, thoracotomy, thoracoscopy, autopsy), which ensured accurate estimation of sensitivities and specificities of POCS as an index test. However, we were unable to detect investigations that used MRI, laparoscopy, thoracotomy, or thoracoscopy as the single reference standard as specified by our inclusion and exclusion criteria, thus we could not evaluate the diagnostic accuracy of POCS compared to these diagnostic techniques. As a consequence, the diagnostic accuracy of POCS mainly refers to CT as a comparator rather than any other reference standard, which may limit the generalisability of this review.

Applicability of findings to the review question

In order to generate clinically realistic and relevant evidence, we kept our inclusion criteria fairly broad. Consequently, individual study and participant characteristics varied substantially, for example in terms of age, affected body region, target conditions, operators' expertise, hardware specification, etc. Unsurprisingly, while this variation led to marked heterogeneity in both sensitivity and specificity estimates between studies, it also enabled us to compare accuracy estimates across various settings and POCS applications.

We assessed concerns about applicability in participant selection, index test, and reference standards by using tailored questions in the QUADAS‐2 tool. Of the 34 included studies, 11 were associated with low concern about applicability (Benya 2000; Blaivas 2005; Coley 2000; Emery 2001; Ghafouri 2016; Kendall 2009; Nandipati 2011; Ojaghi 2014; Talari 2015; Todd Miller 2003; Zhang 2006). In the patient selection and index test domains, 29 studies showed low concerns about applicability. In patient selection, we assigned high concerns to five studies because of restricted inclusion criteria (i.e. only pelvic fractures (Friese 2007; Verbeek 2014); only minor, Menichini 2015, or only major trauma, Corbett 2000; or with limited organ lesions only (Kärk 2012)). We judged five studies to have unclear ratings in the index test domain owing to missing information about body areas examined (Calder 2017; Clevert 2008; Valentino 2010; Wong 2014; Zhou 2012). The conditional use of reference standards depending on the results of clinical observation, ultrasound examination, or participants' haemodynamic stability led to 16 high applicability rating concerns in the reference standard domain.

In summary, we rated 85% of all included studies as being of low concern for applicability in the patient selection and index test domains, whereas we rated only 47% of studies as being of low concern in the reference standard domain. While the included spectrum of participants in this review may appropriately reflect the intended population, and the index tests used in the included studies may not differ considerably from those in clinical practice, the spectrum of reference standards may not correspond completely to the whole range of tests actually used in the setting of thoracoabdominal trauma.

Figure 1

Study flow diagram for the search conducted on 15 July 2017.

Figure 2

Risk of bias and applicability concerns summary: review authors' judgements about each domain for each included study

Figure 3

Risk of bias and applicability concerns graph: review authors' judgements about each domain presented as percentages across included studies

Figure 4

Coupled forest plots of sensitivity and specificity. TP = true positive; FP = false positive; FN = false negative; TN = true negative

Figure 5

Summary receiver operating characteristic (ROC) plot of sensitivity and specificity of all 34 included studies. The solid circle represents the summary estimate of sensitivity and specificity. The summary estimate is surrounded by a dotted line representing the 95% confidence region and a dashed line representing the 95% prediction region.

Figure 6

Coupled forest plots of sensitivity and specificity for studies targeting only free fluid or free air (n = 22). TP = true positive; FP = false positive; FN = false negative; TN = true negative

Figure 7

Coupled forest plots of sensitivity and specificity for studies considering both surrogates and organ lacerations (n = 7). TP = true positive; FP = false positive; FN = false negative; TN = true negative

Figure 8

Summary receiver operating characteristic (ROC) plot of sensitivity and specificity: paediatric studies (n = 10; indicated in black) versus non‐paediatric studies (n = 24; indicated in red). The solid circles represent the summary estimates of sensitivity and specificity. The summary estimates are surrounded by a dotted line representing the 95% confidence region and a dashed line representing the 95% prediction region.

Figure 9

Summary receiver operating characteristic (ROC) plot of sensitivity and specificity: abdominal studies (n = 27; indicated in black) versus thoracic studies (n = 4; indicated in red). The solid circles represent the summary estimates of sensitivity and specificity. The summary estimates are surrounded by a dotted line representing the 95% confidence region and a dashed line representing the 95% prediction region.

Test 1

Main analysis set.

Test 2

Sensitivity analysis set with lower sensitivity/specificity values in two original studies.

Summary of findings 1

Population: Patients of any age and gender who sustained any type of blunt injury in a civilian scenario

Setting: Clinical evaluation at hospitals of any care level

Index test: Point‐of‐care sonography (POCS) as the primary imaging tool

Reference standard: Computed tomography (CT), magnetic resonance imaging (MRI), laparotomy, laparoscopy, thoracotomy, thoracoscopy, autopsy

Findings

  1. POCS emerged as an integral part of trauma algorithms, and remains the point‐of‐care imaging tool of choice for screening for thoracoabdominal bleeding in most regions of the world.

  2. Determining the diagnostic accuracy of POCS in patients with blunt trauma may provide clinicians with valuable information on the likelihood of chest and abdominal injuries and may contribute to decision making regarding the performance of subsequent diagnostic tests.

Limitations

  1. Methodological quality was hampered by severe under‐reporting in the included studies. We assessed risk of bias as unclear in more than half of the studies for the domains of patient selection and reference standard, and in one‐third of the studies for the index test.

  2. There was substantial heterogeneity among the results of the individual studies, which we investigated further by sources of heterogeneity (see summary of findings Table 2).

| Analysis | No. of participants (studies) | Summary sensitivity (95% CI) | Summary specificity (95% CI) | Summary LR+ (95% CI) | Summary LR‐ (95% CI) | Positive predictive value (95% CI) | Negative predictive value (95% CI) | Missed injuries in a virtual cohort of 1000 (a) | Overtreated in a virtual cohort of 1000 (a) |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Main analysis | 8635 (34) | 0.74 (0.65 to 0.81) | 0.96 (0.94 to 0.98) | 18.5 (10.8 to 40.5) | 0.27 (0.19 to 0.37) | 0.88 (0.81 to 0.94) | 0.90 (0.87 to 0.93) | 73 (If 280 people suffer an injury through trauma, 207 will be identified as injured, and 73 will be missed.) | 29 (If 720 people do not suffer an injury through trauma, 29 will be treated as though they had been injured, i.e. overtreated.) |
| Sensitivity analysis with a children‐only cohort | 1384 (10) | 0.62 (0.47 to 0.75) | 0.91 (0.81 to 0.96) | 6.9 (2.5 to 18.8) | 0.42 (0.26 to 0.65) | 0.76 (0.53 to 0.89) | 0.84 (0.77 to 0.90) | 118 (If 310 children suffer an injury through trauma, 192 will be identified as injured, and 118 will be missed.) | 62 (If 690 children do not suffer an injury through trauma, 62 will be treated as though they had been injured, i.e. overtreated.) |

(a) The median prevalence was 28% for the complete study population and 31% for the children‐only cohort.

Abbreviations

CI: confidence interval
LR+: positive likelihood ratio
LR‐: negative likelihood ratio

Summary of findings 2. Investigation of heterogeneity

| Covariate | Subgroup | Number of studies | Summary sensitivity (95% CI) | Summary specificity (95% CI) | Chi2 (a) | P value (b) |
| --- | --- | --- | --- | --- | --- | --- |
| Reference standard | Single CT | 25 | 0.75 (0.63 to 0.84) | 0.97 (0.93 to 0.98) | 0.18 (overall) | 0.9160 (overall) |
| | CT plus laparotomy | 7 | 0.73 (0.58 to 0.84) | 0.95 (0.87 to 0.98) | | |
| Target condition | Limited to free fluid/free air | 22 | 0.78 (0.68 to 0.85) | 0.97 (0.96 to 0.99) | 9.10 (overall); 0.06 (sensitivity); 8.08 (specificity) | 0.0106 (overall); 0.8100 (sensitivity); 0.0045 (specificity) |
| | Free fluid/free air and organ injuries/vascular lesions | 7 | 0.80 (0.73 to 0.85) | 0.88 (0.70 to 0.96) | | |
| Age of participant | Children | 10 | 0.63 (0.46 to 0.77) | 0.91 (0.81 to 0.96) | 7.32 (overall); 54.91 (sensitivity); 19.88 (specificity) | 0.0258 (overall); 0.0000 (sensitivity); 0.0000 (specificity) |
| | Adults/mixed | 24 | 0.78 (0.69 to 0.84) | 0.97 (0.96 to 0.99) | | |
| Type of injury | Abdominal injury | 27 | 0.68 (0.59 to 0.75) | 0.95 (0.92 to 0.97) | 17.36 (overall); 13.22 (sensitivity); 5.39 (specificity) | 0.0002 (overall); 0.0003 (sensitivity); 0.0202 (specificity) |
| | Thoracic injury | 4 | 0.96 (0.88 to 0.99) | 0.99 (0.97 to 1.00) | | |

(a) Large values of the Chi2 statistic indicate that test performance may be associated with the particular covariate.

(b) P values < 0.05 indicate statistical evidence that sensitivity and/or specificity differ between the examined groups.

Abbreviations

CI: confidence interval
CT: computed tomography
Table Tests. Data tables by test

| Test | No. of studies | No. of participants |
| --- | --- | --- |
| 1 Main analysis set | 34 | 8635 |
| 2 Sensitivity analysis set with lower sensitivity/specificity values in two original studies | 34 | 8635 |