Original contribution
High-definition freehand 3-D ultrasound

https://doi.org/10.1016/S0301-5629(02)00735-4

Abstract

This paper describes a high-definition freehand 3-D ultrasound (US) system, with accuracy surpassing that of previously documented systems. 3-D point location accuracy within a US data set can be achieved to within 0.5 mm. Such accuracy is possible through a series of novel system-design and calibration techniques. The accuracy is quantified using a purpose-built tissue-mimicking phantom, designed to create realistic clinical conditions without compromising the accuracy of the measurement procedure. The paper includes a thorough discussion of the various ways of measuring system accuracy and their relative merits; and compares, in this context, all recently documented freehand 3-D US systems.

Introduction

Advances in the resolution and quality of two-dimensional (2-D) ultrasound (US) imaging are increasingly enabling detailed examination of arterial and musculoskeletal anatomy (Fornage et al. 2000; Wang et al. 1999). However, high-resolution US images (B-scans) have a limited field of view, generally sufficient for scanning the cross-section, but not the length, of the anatomy of interest. 3-D US can overcome this limitation. Not only does it provide the ability to generate extended images, it also allows the visualisation of complex structures, such as ligaments and cartilage or arterial plaque, in a much more intuitive way. A further advantage is that 3-D US, by its very nature, offers much more precise measurement of volume and the relative orientation of structures.

There are many ways to design 3-D US systems (Fenster et al. 2001). The most appropriate technique for arterial and musculoskeletal anatomy is freehand 3-D US, where the probe is moved by hand, and the resulting sequence of B-scans is located in 3-D space by either intrinsic (image-based) or extrinsic (position-sensing) means. This is the only technique that gives the clinician complete freedom to guide the probe along the path of the anatomy.

Most recently documented freehand 3-D US systems use either an electromagnetic (Barry et al. 1997; Edwards et al. 1998; Prager et al. 1999) or an optical (Blackall et al. 2000; Bouchet et al. 2001) position sensor. B-scans are transferred from the US scanner to an external PC by digitising from the scanner’s video output (Edwards et al. 1998; Prager et al. 2002; Meairs et al. 2000), from a video recording (Barry et al. 1997) or by direct digital transfer (Berg et al. 1999). All position-sensing techniques, and most image-transfer protocols, introduce additional sources of error not present in the original 2-D B-scans: typical accuracy of documented 3-D systems is on the order of ± 2 mm. This is significantly worse than the inherent resolution of high-frequency B-scan images, which is better than 0.1 mm/pixel.

The position-sensing and image-acquisition techniques are only two of many steps that affect the eventual resolution of the 3-D system. A review of the engineering challenges in such systems, together with some practical suggestions for good acquisition protocols, is given by Gee et al. (2003). This review uses our system, Stradx (Prager et al. 1999), as a worked example. Stradx is a sequential freehand 3-D US system. In this paradigm, the (arbitrarily orientated, but only gradually varying) sequence of B-scans is preserved, rather than resampled onto a regular voxel array, so that each visualisation or quantification step is calculated directly from the original data. This is an ideal starting point for improving the definition of 3-D systems because we preserve the 2-D resolution for as long as possible.

In this paper, we present recent developments to Stradx that aim to improve the overall system accuracy, so that the 3-D definition can approach that of the original, high-resolution 2-D B-scans. The system is rigorously assessed by using a purpose-built tissue-mimicking phantom, allowing us to estimate the actual errors that might be expected when making 3-D measurements in real clinical situations. We also consider the practical aspects of using the system and investigate the degradation due to remounting the position sensor on the US probe, and changing the US machine’s depth setting, without carrying out a full recalibration.

Before considering how we might improve the resolution of a freehand 3-D US system, we need a clear picture of where the errors come from. An overview of the main sources of error is given in Fig. 1. These can be grouped into errors in the B-scan images themselves, the readings from the position sensor, temporal matching of B-scans and positions, location of the B-scan relative to the position reported by the sensor and errors in the 3-D reconstruction of the B-scans.

Errors in the B-scans themselves are largely determined by the size of the resolution cell, which varies in all three dimensions. Typically, the out-of-plane resolution (or beam width) is significantly worse than the in-plane resolution, and varies across the depth of the image, dependent mainly on the out-of-plane focusing. Variation in the speed of sound can also have a significant effect on the beam width: errors of ± 5% in sound speed, typical of variations in human tissue, can generate over 200% increases in beam width, as well as affecting the depth scale (Anderson et al. 2000). For high-resolution images, compression of anatomy due to probe pressure can be a large source of error, but this can be reduced by image-correlation techniques (Treece et al. 2002).
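The effect of sound speed on the depth scale can be illustrated with a short calculation. The sketch below assumes that the scanner converts echo time to depth using a fixed nominal speed of 1540 m/s (a common convention, not a value stated in this paper); the function name and the 20 mm example depth are illustrative only. It does not model the change in beam width.

```python
NOMINAL_SPEED = 1540.0  # m/s, assumed scanner calibration speed (not from the paper)


def displayed_depth(true_depth_mm, true_speed, assumed_speed=NOMINAL_SPEED):
    """Depth shown by the scanner for a reflector at true_depth_mm.

    The round-trip echo time is 2 * depth / true_speed; the scanner
    converts that time back to a depth using assumed_speed.
    """
    round_trip_time = 2.0 * true_depth_mm / true_speed
    return 0.5 * round_trip_time * assumed_speed


if __name__ == "__main__":
    # Sound speed 5% below, equal to, and 5% above the assumed value.
    for factor in (0.95, 1.00, 1.05):
        shown = displayed_depth(20.0, NOMINAL_SPEED * factor)
        print(f"speed x{factor:.2f}: reflector at 20.00 mm shown at {shown:.2f} mm")
```

Under this assumption, a 5% error in sound speed shifts a reflector at 20 mm depth by roughly 1 mm, which is large compared with a pixel size of 0.1 mm.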

Of all the position-sensing techniques, optical position sensors are the most accurate, although they require a line of sight between the probe and the camera. Such systems can achieve a spatial accuracy (for the position sensor alone) of up to ± 0.2 mm (Blackall et al. 2000). Electromagnetic position sensors can achieve an accuracy of up to ± 0.5 mm in location and ± 0.7° in orientation (again, for the position sensor alone) when optimised for very specific situations and small spatial ranges (Barratt et al. 2001). In general use, however, they are subject to distortions that impair their accuracy considerably (Birkfellner et al. 1998).

Errors due to image transfer and temporal calibration (the matching of images to positions) are less widely discussed in the literature; most researchers opt for the practical solution of digitising the analogue video output of the US machine at between 10 and 25 frames per second. The images are corrupted by conversion to and from analogue video formats, and the temporal resolution is limited by the low frame rates. A notable exception is the system described by Berg et al. (1999), where images are transferred digitally at 150 frames per second, giving a temporal resolution of 7 ms. Temporal calibration at 25 frames per second can be achieved to within 40 ms by looking for sudden changes in the image and position streams (Meairs et al. 2000; Prager et al. 1999). Inaccurate temporal calibration will result in spatial errors with a magnitude dependent on the speed with which the probe is moved. Such errors can also reduce the spatial calibration accuracy.
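The relationship between frame rate, timing error and spatial error can be made concrete with simple arithmetic. In the sketch below the frame rates and the 40 ms figure are those quoted above, whereas the probe speeds are hypothetical values chosen for illustration, and the probe is assumed to move at constant speed during the timing error.

```python
def frame_interval_ms(frames_per_second):
    """Time between consecutive B-scans, in milliseconds."""
    return 1000.0 / frames_per_second


def spatial_error_mm(probe_speed_mm_per_s, timing_error_ms):
    """Distance the probe travels during an uncorrected timing error."""
    return probe_speed_mm_per_s * timing_error_ms / 1000.0


if __name__ == "__main__":
    for fps in (10, 25, 150):
        print(f"{fps:3d} frames/s -> {frame_interval_ms(fps):5.1f} ms between B-scans")
    # Hypothetical probe speeds; 40 ms is the calibration limit quoted in the text.
    for speed in (5.0, 10.0, 25.0):
        error = spatial_error_mm(speed, 40.0)
        print(f"{speed:4.1f} mm/s with a 40 ms error -> {error:.2f} mm")
```

At 150 frames per second the interval is about 6.7 ms, consistent with the 7 ms resolution quoted for Berg et al. (1999). A 40 ms error at an assumed probe speed of 25 mm/s corresponds to a 1 mm displacement.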

Spatial calibration, the estimation of the rigid body transformation between the position sensor’s reference frame and the B-scan plane, is one of the dominant sources of error in freehand systems. The various calibration techniques are compared by Prager et al. (1998). The calibration process involves scanning a known object from a variety of orientations; this can be a single point (Legget et al. 1998), a set of points (Berg et al. 1999), a crosswire (Barry et al. 1997; Meairs et al. 2000), a “z-shape” (Bouchet et al. 2001), a real or virtual plane (Prager et al. 1998) or, in fact, any known shape (Blackall et al. 2000). By constraining the 3-D reconstruction to match the known geometry of the scanned object, it is possible to derive a system of equations for the eight spatial calibration parameters (six defining the location and orientation of the B-scan relative to the position sensor, and two defining the x and y scales of the B-scan in mm/pixel). The system of equations can either be inverted directly or, more usually, optimised iteratively.
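The role of the eight parameters can be seen by writing out how a pixel is mapped into 3-D space. The following is a minimal sketch, not the Stradx implementation: the rotation order (Rz·Ry·Rx), the parameter ordering in `calib`, and all function names are assumptions made for illustration.

```python
import math


def rotation_zyx(ax, ay, az):
    """3x3 rotation R = Rz(az) * Ry(ay) * Rx(ax), angles in radians (assumed convention)."""
    cx, sx = math.cos(ax), math.sin(ax)
    cy, sy = math.cos(ay), math.sin(ay)
    cz, sz = math.cos(az), math.sin(az)
    return [
        [cz * cy, cz * sy * sx - sz * cx, cz * sy * cx + sz * sx],
        [sz * cy, sz * sy * sx + cz * cx, sz * sy * cx - cz * sx],
        [-sy, cy * sx, cy * cx],
    ]


def apply_rigid(rotation, translation, point):
    """Return rotation * point + translation for 3-vectors."""
    return [
        sum(rotation[i][j] * point[j] for j in range(3)) + translation[i]
        for i in range(3)
    ]


def pixel_to_sensor(u, v, calib):
    """Map pixel (u, v) into the position sensor's frame using the eight parameters."""
    scale_x, scale_y, ax, ay, az, tx, ty, tz = calib
    in_plane = [scale_x * u, scale_y * v, 0.0]  # pixels to mm; the B-scan lies in z = 0
    return apply_rigid(rotation_zyx(ax, ay, az), [tx, ty, tz], in_plane)


def pixel_to_world(u, v, calib, sensor_rotation, sensor_translation):
    """Chain the calibration with the pose reported by the position sensor."""
    return apply_rigid(sensor_rotation, sensor_translation, pixel_to_sensor(u, v, calib))


if __name__ == "__main__":
    # Hypothetical calibration: 0.1 mm/pixel, 90 degree rotation about z, fixed offset.
    calib = (0.1, 0.1, 0.0, 0.0, math.pi / 2.0, 10.0, 20.0, 30.0)
    identity = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    print(pixel_to_world(100, 50, calib, identity, [0.0, 0.0, 0.0]))
```

An iterative calibration would adjust the eight values in `calib` until the points reconstructed by `pixel_to_world`, over all scanned orientations, best match the known geometry of the calibration object.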

Even after the location of each B-scan has been correctly determined, there are still further sources of error. An unspoken assumption in the subsequent 3-D reconstruction is that the subject has not moved during the acquisition; any such movements result in a distortion of the 3-D data. External movement can be ameliorated by attaching a coordinate reference to the patient (Chuang et al. 2001) and repetitive internal movement (e.g., due to cardiac activity) by the use of an electrocardiogram (ECG) to gate the acquisition of B-scans (Belohlavek et al. 1994; Palombo et al. 1998). If possible, it is best to acquire data within a single breath-hold and to review it immediately for motion artefacts, so that the scan can be repeated if necessary (Gee et al. 2003). Where this is not possible, for instance, in acquiring dense ECG-gated data or where the patient has difficulty holding their breath, ECG or respiratory gating procedures must be used; however, these will inevitably reduce the accuracy of the 3-D freehand data.

Visualising the semistructured 3-D freehand data involves interpolating the data onto some sort of regular pixel or voxel array. Significant interpolation errors can arise as a consequence of the scanning pattern (Cardinal et al. 2000) combined with simplistic interpolation schemes, optimised for speed rather than quality (Rohling et al. 1999). Such errors can be limited by not resampling onto a regular voxel array, as described earlier. This approach also suppresses some interpretive errors; for instance, when delineating structures in artefact-ridden out-of-plane reconstructions (Bailey et al. 2001). Other interpretive errors arise from poor cursor placement when making measurements (Goldstein 2000).

There are many ways of assessing the performance of a freehand 3-D US system and, unfortunately, there is no agreed standard. More confusingly, results are generally quoted simply as system “accuracy,” despite differences in what was tested, where it was tested and how the results were analysed, which can lead to as much as a factor of 3 variation in the quoted result. Sometimes, insufficient information is provided to be able to interpret the quoted “accuracy” at all.

It is, therefore, necessary to clarify the differences between some of these measures before attempting to compare those systems that are described in the literature and place our system among them. This can, helpfully, be done by asking three questions: “what part of the system is included in the measurement,” “what is it a measurement of,” and “how are the measurements analysed?”

First, “what part of the system is included in the measurement?” With reference to Fig. 1, when designing a system, it might be helpful to know the accuracy of a specific part, for instance, the position sensor alone (Barratt et al. 2001), or the spatial calibration alone (Prager et al. 1998). Ultimately, however, it is the accuracy of the entire system, in the context in which it will be used, that is relevant to the clinician. In vivo accuracy is very difficult to assess, so in vitro accuracy is usually reported instead, by scanning a specially designed phantom in a water bath. This excludes some of the B-scan image errors, such as the speed of sound variation in human tissue and tissue deformation due to probe pressure. It also excludes some of the 3-D reconstruction errors, specifically those due to movement of anatomy and, to some extent, those due to interpretation of data (because phantom images are often significantly less complex than in vivo images). Clearly, the accuracy of the entire system can only be worse than that of its component parts.

Second, “what is it a measurement of?” There are a variety of possibilities here, in terms of the quantity measured, where it is measured and what it is compared with. Generally, the quantity is either the location of a fixed point (Barry et al. 1997; Blackall et al. 2000; Bouchet et al. 2001; Meairs et al. 2000; Prager et al. 1998), the distance between points (Blackall et al. 2000; Legget et al. 1998; Prager et al. 1998) or the volume of a defined object (Barry et al. 1997; Berg et al. 1999). The location of a point can, perhaps, be regarded as a more fundamental measure because volume and, to a lesser extent, distance are not affected by certain distortions of the 3-D data. Where the quantity is measured is particularly important if the spatial calibration has been optimised from the same data used to assess the system accuracy (which is, unfortunately, common practice in the literature). If this is the case, then what is being measured is only the calibration residual error (Barry et al. 1997), but how well this reflects the actual system accuracy is highly dependent on how well-conditioned the calibration optimisation is, and how well the calibration scanning pattern represents actual scanning practice. Even a repeated scan of the point on which the calibration was based (Prager et al. 1998) can be misleading; it is better to assess accuracy based on a completely different set of measurements. Finally, measurements can either be compared to “true” values (known from some other independent source), in which case they reveal system accuracy, or to themselves, in which case they reveal only the precision of the system.

Third, “how are the measurements analysed?” As an example, consider a set of measurements of point location, with independent errors in each of the x, y and z dimensions that are normally distributed with zero mean and 1 mm SD. The 95% confidence limits in each dimension are approximately twice the SD (i.e., ± 2 mm). However, the absolute 2-D location error (for instance, in the xy plane) is not normally distributed: it follows a Rayleigh distribution. The absolute 3-D location error has an even more complex distribution. The mean absolute 2-D error is approximately 1.25 mm, and 95% of the points lie within 1.85 times this (i.e., within 2.3 mm of the true location). For the 3-D case, the mean error is 1.6 mm and the 95% limit is <2.8 mm (approximately 1.75 times the mean). Note that these confidence limits are lower than the pessimistic estimate from simply summing the variances in each dimension, as in Legget et al. (1998). The SD of the 2-D and 3-D errors is sometimes also quoted; this is a misleading quantity because these errors are not normally distributed and can lead to optimistic assessments of system accuracy. For the example above, the SD is approximately 0.7 mm, for both the 2-D and 3-D cases.
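The figures in this example can be checked against the standard closed forms for the magnitude of a Gaussian error vector (the Rayleigh distribution in 2-D and the Maxwell distribution in 3-D). The sketch below assumes independent, zero-mean errors with the same SD on each axis; the function names are illustrative.

```python
import math


def rayleigh_stats(sigma=1.0):
    """Mean, SD and 95% radius of the 2-D error magnitude."""
    mean = sigma * math.sqrt(math.pi / 2.0)
    sd = sigma * math.sqrt(2.0 - math.pi / 2.0)
    r95 = sigma * math.sqrt(-2.0 * math.log(1.0 - 0.95))
    return mean, sd, r95


def maxwell_cdf(r):
    """P(error magnitude <= r) for three unit-SD Gaussian components."""
    return math.erf(r / math.sqrt(2.0)) - math.sqrt(2.0 / math.pi) * r * math.exp(-0.5 * r * r)


def maxwell_stats(sigma=1.0):
    """Mean, SD and 95% radius of the 3-D error magnitude."""
    mean = sigma * 2.0 * math.sqrt(2.0 / math.pi)
    sd = sigma * math.sqrt(3.0 - 8.0 / math.pi)
    lo, hi = 0.0, 10.0
    for _ in range(60):  # bisection on the unit-SD CDF
        mid = 0.5 * (lo + hi)
        if maxwell_cdf(mid) < 0.95:
            lo = mid
        else:
            hi = mid
    return mean, sd, sigma * 0.5 * (lo + hi)


if __name__ == "__main__":
    for name, stats in (("2-D", rayleigh_stats()), ("3-D", maxwell_stats())):
        mean, sd, r95 = stats
        print(f"{name}: mean {mean:.2f} mm, SD {sd:.2f} mm, 95% within {r95:.2f} mm")
```

Under these assumptions the 3-D mean (about 1.6 mm), the 3-D 95% radius (about 2.8 mm) and the SDs (about 0.7 mm) agree with the values quoted above. The closed-form 2-D 95% radius comes out nearer 2.45 mm, somewhat above the 2.3 mm quoted.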

A further complication arises from the use of paired analysis (Blackall et al. 2000; Prager et al. 1998), where a set of point measurements is analysed by considering the distribution of the absolute distance between all possible pairs of measurements. If we continue the example above, this analysis would give a mean 3-D error of 2.3 mm and 95% confidence limit of <4 mm, whereas we already know that 95% of the values will lie within 2.8 mm of the correct location. In effect, the paired analysis measures relative distance accuracy, which has twice the variance of the point-location distribution because it is a measure of difference.
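The effect of paired analysis can be reproduced with a small simulation. The sketch below is illustrative only: it draws synthetic point measurements with 1 mm SD on each axis (the number of points and the random seed are arbitrary choices) and summarises the distances between all pairs.

```python
import math
import random


def paired_distance_stats(n_points=400, sigma=1.0, seed=0):
    """Mean and 95th percentile of the distance between all pairs of points."""
    rng = random.Random(seed)
    points = [[rng.gauss(0.0, sigma) for _ in range(3)] for _ in range(n_points)]
    distances = sorted(
        math.dist(points[i], points[j])
        for i in range(n_points)
        for j in range(i + 1, n_points)
    )
    mean = sum(distances) / len(distances)
    p95 = distances[int(0.95 * len(distances))]
    return mean, p95


if __name__ == "__main__":
    mean, p95 = paired_distance_stats()
    print(f"paired analysis: mean {mean:.2f} mm, 95% within {p95:.2f} mm")
```

Up to sampling noise, the results should land near the 2.3 mm and 4 mm figures above, that is, about √2 times the corresponding point-location values.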

Although it is not, in general, possible to compare accuracy results that differ with regard to the first two of these questions, it is possible to use the example above to convert between results that are analysed differently, provided sufficient information has been given to determine the nature of the result that has been presented. In the comparison below, results are converted to the “3-D confidence limit”; this is the distance away from the mean location (for precision) or known location (for accuracy) within which 95% of the measured points will lie. This conversion assumes an unbiased distribution of errors; significant bias in the results will introduce errors in this process.
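One way such conversions might be carried out is sketched below. It assumes unbiased, independent Gaussian errors with equal SD on each axis, as in the example above, and is not necessarily the exact procedure used for the comparison that follows; the function names are hypothetical.

```python
import math

# Constants for unit per-axis SD in three dimensions.
MAXWELL_MEAN = 2.0 * math.sqrt(2.0 / math.pi)  # mean 3-D error magnitude
MAXWELL_R95 = math.sqrt(7.8147)  # square root of the chi-square (3 dof) 95% point


def limit_from_axis_sd(sd_mm):
    """3-D confidence limit from the SD of the error on a single axis."""
    return sd_mm * MAXWELL_R95


def limit_from_mean_3d_error(mean_error_mm):
    """3-D confidence limit from the mean absolute 3-D point error."""
    return mean_error_mm * MAXWELL_R95 / MAXWELL_MEAN


def limit_from_mean_paired_error(mean_paired_mm):
    """3-D confidence limit from the mean absolute distance between pairs.

    Paired distances have twice the variance of point errors, so the mean
    is first reduced by sqrt(2).
    """
    return limit_from_mean_3d_error(mean_paired_mm / math.sqrt(2.0))


if __name__ == "__main__":
    print(f"axis SD 1.0 mm       -> {limit_from_axis_sd(1.0):.2f} mm")
    print(f"mean 3-D 1.6 mm      -> {limit_from_mean_3d_error(1.6):.2f} mm")
    print(f"mean paired 2.3 mm   -> {limit_from_mean_paired_error(2.3):.2f} mm")
```

Each of the three inputs in the usage example describes the same unit-SD distribution, so each should convert to a limit of roughly 2.8 mm, with small differences arising from the rounding of the inputs.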

The performance of a freehand 3-D US system is clearly dependent on the type of position sensor and the frequency of the US. Prager et al. (1998) used an electromagnetic position sensor and a 7 MHz probe at a 4 cm depth setting; Blackall et al. (2000) used an optical position sensor and a 10 MHz probe, again at a 4 cm depth setting; Meairs et al. (2000) used an electromagnetic position sensor and a 5–12 MHz probe; Legget et al. (1998) used an electromagnetic position sensor and a 3 MHz probe; and Bouchet et al. (2001) used an optical position sensor and a 3.5 MHz probe. Systems using optical position sensors and higher-frequency probes will tend to be more accurate. Lower depth settings can also result in greater accuracy, depending on whether or not the B-scan is zoomed in the video display and how this relates to the actual resolution of the B-scan.

Since spatial calibration is such an important step in the design of an accurate system, several authors quote the point precision due to the spatial calibration alone. The 3-D confidence limits achieved for this value are <1.2 mm (Prager et al. 1998) and <2.3 mm (Blackall et al. 2000) (both derived from mean paired absolute error). Another frequently quoted value is the point precision of the entire system (as measured by scanning a phantom; note the earlier comments about in vitro measurements). 3-D confidence limits for this value are <2.7 mm (Prager et al. 1998), <1.4 mm (Blackall et al. 2000) and <2.6 mm (Meairs et al. 2000) (all derived from mean paired absolute error), <3.4 mm (Legget et al. 1998) (derived from the sum of the variances in each dimension) and <2.2 mm (Bouchet et al. 2001) (derived from the mean absolute error in each dimension). Finally, several authors quote the errors in distances between several points; this leads to a measure of accuracy within a particular data set that is sensitive to any distortion introduced into the data, but not to systematic errors in point location. 3-D confidence limits for point location accuracy based on this measure are <1.9 mm (Prager et al. 1998), <1.0 mm (Blackall et al. 2000) and <1.1 mm (Legget et al. 1998), all derived from the SD of the paired signed distance errors.

Section snippets

Physical layout

Figure 2 shows the physical layout of our freehand 3-D US system. B-scans are acquired with a Diasus US machine, using 5–10 MHz and 10–22 MHz linear-array probes, on 2 cm, 3 cm and 6 cm depth settings. Eight-bit digital log-compressed data are transferred via Ethernet at 25 B-scans per second to an 800-MHz PC running Linux. The probe position is measured by a Polaris

Calibration

Temporal calibration is necessary to determine the offset between the position sensor time-stamps and the B-scan time-stamps. A matching B-scan and position reading will not necessarily have the same time-stamp, depending on the latencies of the two data streams. Particularly pertinent is the latency of the position sensor, because the position time-stamps are applied, not by the position sensor (which has no clock), but by the PC each time it requests a reading from the sensor. There is an

Ultrasound phantom

To establish the performance of the system, a highly accurate tissue-mimicking US phantom was required. Phantoms consisting of very thin nylon wires embedded in a tissue-mimicking material are often used to determine the resolution of 2-D US machines. However, these phantoms are only designed to be scanned from one insonification angle. To test a freehand 3-D US system, we need a target that can be scanned from multiple angles, such as a small nonechogenic sphere. Phantoms containing such

Results

Wherever confidence limits are given in the following analysis, these are derived from unbiased estimates of the population statistics, allowing for the quantity of measured data and the number of parameters derived from these data during the calculation. In each case, the coordinate system is aligned with the phantom, so that x is along the rows of spheres, y down the columns and z out of the plane of the spheres.

Three probe frequencies and depth settings were used in the following

Summary

Figure 15 shows a comparison of the results for the highest definition system (10–22 MHz probe and 2 cm depth setting) and the other systems cited in the introduction. The distance measurement accuracy for the system presented in this paper is approximately √2 times the point location accuracy because it is a measure of difference between two identical distributions.

As has been previously explained, the other systems’ accuracies were assessed in slightly different circumstances and quoted in

Conclusions

Our system can be used to locate points within a freehand 3-D data set to an accuracy of < 0.50 mm, using a 10–22 MHz probe on a 2 cm depth setting. The accuracy with which distances can be measured within a data set is approximately ± 0.7 mm. This accuracy can be achieved by using the temporal and spatial calibrations outlined in this paper and, subsequently, leaving the probe settings and position sensor mounting unchanged. It is valid for all practical freehand scanning patterns. The

Acknowledgements

This work was carried out under an EPSRC grant (GR/N21062). Dynamic Imaging Ltd. provided a modified US machine to enable digital data acquisition.
