Background
PET is an important tool for in vivo quantitative measurement of physiological, biochemical, or pharmacological processes. In thoracic oncology, 18F-FDG PET currently plays a major role in clinical diagnosis, staging, prognosis and assessment of response to treatment [1]. Recent advances in PET and CT technology have improved image quality while reducing radiation exposure to patients [2, 3], which opens new avenues for cancer screening with PET and CT. Several studies have shown that low dose CT is superior to traditional chest radiography for lung cancer screening and follow-up, detecting more nodules and lung cancers, including early-stage cancers [4, 5]. In addition, low dose CT screening of subjects at high risk could reduce lung cancer mortality [6]. However, due to its limited specificity, low dose CT screening also leads to overdiagnosis: more than 18 % of all the lung cancers it detected were indolent [7], although computer-aided diagnosis could improve the performance of CT screening [8]. A large clinical study on CT screening of patients at risk showed that, even though low dose CT screening reduced cancer mortality, 24.2 % of the patients tested positive and 96.4 % of these positive results were false positives [6]. This large number of false positives calls for imaging techniques with higher specificity, in order to avoid unneeded invasive biopsy. Additional metabolic information from 18F-FDG PET has been shown to be more specific than CT in detecting lung cancer [9]. Moreover, the combination of CT and PET demonstrated better performance in classifying solitary pulmonary nodules as benign or malignant than either PET or CT alone [8]. Thus, the synergistic effect of PET and CT could potentially improve the accuracy of screening for lung cancer [10]. As with low dose CT screening, the radiation exposure due to the injected isotope should be minimized without compromising PET image quality. The effective dose associated with the 18F-FDG PET exam in this study was computed from the reported ICRP value of 0.019 mSv/MBq for a 70 kg adult. For example, the effective dose is about 7 mSv for a typical administration of 10 mCi (370 MBq) of 18F-FDG, which is much higher than the 1.5 mSv of the low dose CT protocol used in the National Lung Screening Trial [11]. The continual improvement of PET imaging, such as the introduction of point spread function and time of flight technologies, could allow for lower injected activities while minimizing the impact on image quality [12].
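The dose figure quoted above can be checked with a few lines of arithmetic. The sketch below is only illustrative: the function name and constant names are hypothetical, and the only inputs taken from the text are the coefficient of 0.019 mSv/MBq, the 10 mCi administration, and the 1.5 mSv low dose CT reference.

```python
MBQ_PER_MCI = 37.0  # unit conversion: 1 mCi = 37 MBq


def effective_dose_msv(activity_mci, coeff_msv_per_mbq=0.019):
    """Effective dose in mSv for an injected activity given in mCi.

    The default coefficient is the ICRP value quoted in the text for a
    70 kg adult; the function itself is a hypothetical helper.
    """
    return activity_mci * MBQ_PER_MCI * coeff_msv_per_mbq


dose = effective_dose_msv(10.0)   # typical 10 mCi administration
ratio = dose / 1.5                # relative to the 1.5 mSv low dose CT protocol

print(f"{dose:.2f} mSv")          # 7.03 mSv
print(f"{ratio:.1f}x low dose CT")  # 4.7x low dose CT
```

Because the dose scales linearly with the injected activity, halving the activity halves the effective dose, which is the motivation for studying image quality at reduced counts.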
A number of phantom studies have investigated the effect of different count levels on PET image quality [13, 14]. Our previous study, which used a data set of 18F-FDG PET images of tuberculosis (TB) patients acquired on a PET/MR scanner, demonstrated that count statistics as low as 5 × 10⁶ counts could achieve a fairly high detectability level [15]. In this work, we aimed to assess the relationship between the number of counts in a PET scan and image quality with these data, based on image metrics such as liver signal-to-noise ratio (SNR), lesion contrast-to-noise ratio (CNR), bias relative to the “true value”, and ensemble noise in the image (lesion and normal tissue).
Discussion
We evaluated image quality at simulated reduced doses with objective metrics, including SNR in the liver, CNR in the lesions, and bias and noise in the liver, normal lung and lesions, using 18F-FDG PET data from TB patients at various count levels. The underlying biology of TB is different from that of lung cancer, but for a technical study such as this, the uptake levels in TB lesions are more representative of early stage lung cancer lesions than those in more advanced lung cancer. This work lays the foundation for determining the appropriate dose or scan time for a future prospective PET/CT study of lung cancer patients.
Accurate delineation of lesions is a prerequisite for quantification of FDG uptake. Although a large number of approaches have been proposed to segment tumors in PET images, including threshold based, gradient based [21], and fuzzy Bayesian based methods [22], accurate tumor segmentation is still a challenging task. This is due to the limited spatial resolution and the relatively high noise level in PET images, and the task evidently becomes more challenging with fewer counts (Zaidi and El Naqa [23]). A simple thresholding method was employed here to segment the solitary lesions in the lung using the full statistics images, and the resulting VOIs were copied to the images at the lower count levels. This simple thresholding method may lead to imperfect delineation of the tumor. In addition, spill-out from the tumor to the surrounding background can lead to lower CNR. However, inaccurate delineation will not change the behavior of the image metrics, since the error has the same effect at the different count levels; this is partially supported by the results for CNR at different thresholds (20, 40, 60 and 80 % of SUVmax). Because this work is aimed at low dose PET imaging of patients at high risk who have indeterminate findings on low dose CT screening, VOIs could instead be delineated on the CT image and copied onto the registered PET images.
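The idea of defining a VOI once on the full statistics image and reusing it at lower count levels can be sketched as follows. This is a minimal illustration, not the segmentation code used in the study: the voxel values are made-up SUVs in a flat list, and `threshold_mask` and `masked_mean` are hypothetical helper names.

```python
def threshold_mask(suv, fraction):
    """Mark voxels whose value is at or above fraction * SUVmax."""
    cutoff = fraction * max(suv)
    return [value >= cutoff for value in suv]


def masked_mean(suv, mask):
    """Mean of the voxels selected by a previously computed mask."""
    selected = [value for value, keep in zip(suv, mask) if keep]
    return sum(selected) / len(selected)


# Hypothetical voxel SUVs along a profile through one lesion.
full_count = [0.8, 1.1, 2.9, 5.0, 4.2, 2.1, 0.9]
low_count = [1.0, 0.7, 3.3, 4.4, 4.9, 1.8, 1.2]   # noisier realization

# Segment on the full statistics image only (40 % of SUVmax -> cutoff 2.0).
mask = threshold_mask(full_count, 0.40)

# Copy the same VOI to the low count image instead of re-segmenting it.
suvmean_full = masked_mean(full_count, mask)
suvmean_low = masked_mean(low_count, mask)
print(mask)
print(round(suvmean_full, 2), round(suvmean_low, 2))  # 3.55 3.6
```

Reusing one mask keeps the VOI identical across count levels, so any delineation error is shared by all of them rather than varying with the noise.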
Figure 1(a) plots SNR² in the liver for all patients at all count levels, and Fig. 1(d) shows the Kᵢ for all patients, derived from the SNR² in the liver for each patient at all count levels. The SNR in the liver depends on many factors, including scanner sensitivity, administered dose, scan time and patient-dependent parameters such as weight. In this study, after adjusting for the effect of patient weight, SNR² was found to have a very good linear relationship with detected counts (y = 1.00x + 0.43, R² = 0.99), which is consistent with the Poisson statistics of positron emission [18, 19]. At very low counts, the Poisson statistics approximation no longer holds, and SNR² decreases more sharply, as shown in Fig. 1(b and c) and in Fig. 1(e and f). Because SNR² in the liver becomes nonlinear at very low count levels, the slope of the fit over the count range 0–1 × 10⁶ differs from that of the fit over the count range 0–20 × 10⁶. As expected, the CNR in the lesions also decreases with decreasing counts. After normalization to the CNR of the full count data, a function can be used to predict CNR at different count levels. A heuristic curve that best fits the CNR was used. At this stage, no interpretative model is proposed for the behavior of the curve; we applied the simplest function that could fit the experimental data and allow for the prediction of CNR at low counts.
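The linear dependence of SNR² on counts follows from Poisson statistics: for a Poisson variable with mean N the variance is also N, so SNR² = mean²/variance = N. The sketch below illustrates this with raw simulated counts in a uniform region. It is a toy model under stated assumptions: it involves no image reconstruction or weight adjustment, the count levels are hypothetical, and it does not reproduce the departure from linearity that the text reports at very low counts.

```python
import math
import random


def poisson_sample(lam, rng):
    """Draw one Poisson variate with mean lam (Knuth's method, small lam only)."""
    limit = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def snr_squared(values):
    """SNR^2 of a region, defined here as mean^2 / sample variance."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    return mean * mean / var


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


rng = random.Random(0)                 # fixed seed for repeatability
mean_counts = [5, 10, 20, 40]          # hypothetical mean counts per voxel
snr2 = [snr_squared([poisson_sample(m, rng) for _ in range(2000)])
        for m in mean_counts]
slope, intercept = fit_line(mean_counts, snr2)
print(f"slope = {slope:.2f}, intercept = {intercept:.2f}")
```

With enough voxels the fitted slope should come out close to 1, mirroring the near-unit slope of the fit quoted in the text.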
A consequence of dose or count reduction is a possible bias in the SUVmean or SUVmax measurement, a larger measurement error (that is, degraded measurement reliability), and an increase of noise in the image, which affects the detectability of small lesions. This effect was studied, and a bias was observed, together with an increase of noise measured as COV and an increase of the STE of the SUVmean and SUVmax measurements in different regions of the patient. Several factors contribute to this, including, for SUVmean, a positive bias in the cold background and a negative bias in the hot regions associated with the positivity constraint of the OSEM reconstruction. SUVmax is more easily affected by count reduction than SUVmean. In Fig. 5, one can observe larger error bars and SUV instability at very low counts.
As shown in Fig. 7, fewer counts in the scan correspond to a higher COV and a higher STE, or error of the radioactivity measurement, for both SUVmean and SUVmax, as well as a higher bias relative to the high count data. Compared with SUVmean, more counts are needed for SUVmax to keep the same acceptable level of COV, bias and STE. In terms of the counts required to obtain the same percent error, fewer counts are needed for lesions than for liver and lung background. For example, if the acceptable level of COV for SUVmean is 10 %, the corresponding required counts are 5 × 10⁶, 25 × 10⁶, and 25 × 10⁶ for TB lesions, liver and normal lungs, respectively (Fig. 7(a)). This can be explained by the inherent “real” local variations of uptake values in large VOIs in the lungs and liver, which are clearly not uniform. In addition, the local variations can explain the different appearance of the histograms in Fig. 6. The numbers of counts needed to maintain the same COV for SUVmax are 20 × 10⁶, 56 × 10⁶ and 52 × 10⁶ for TB lesions, liver and normal lungs, respectively (Fig. 7(d)). The actual variations add to the statistical noise. Bias for SUVmean appears to be less sensitive to count statistics, and minimum bias can be reached even below 1 × 10⁶ counts. In addition, the same acceptable level of STE for SUVmean can be maintained with fewer counts than for COV: 0.4 × 10⁶, 2.7 × 10⁶ and 1 × 10⁶ counts are required to reach an STE threshold of 10 % for TB lesions, liver and normal lungs, respectively (Fig. 7(b)).
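To make the three quantities concrete, the sketch below computes bias, COV and STE for one region from a set of noisy realizations. The exact definitions used in the study are not given in this section, so the ones below are common conventions and should be read as assumptions: bias is the percent difference between the mean over realizations and a reference value, COV is the standard deviation over realizations as a percentage of their mean, and STE is COV divided by the square root of the number of realizations. The numbers are made up.

```python
import math
import statistics


def ensemble_metrics(values, reference):
    """Percent bias, COV and STE of a measurement over noisy realizations.

    Assumed definitions (not taken from the source):
      bias = (mean - reference) / reference
      COV  = sample standard deviation / mean
      STE  = COV / sqrt(number of realizations)
    """
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    bias = 100.0 * (mean - reference) / reference
    cov = 100.0 * sd / mean
    ste = cov / math.sqrt(len(values))
    return bias, cov, ste


# Hypothetical SUVmean of one lesion in five low count realizations,
# and its value in the full count image used as the reference.
realizations = [2.1, 1.9, 2.3, 1.7, 2.0]
reference = 2.05

bias, cov, ste = ensemble_metrics(realizations, reference)
print(f"bias = {bias:.2f} %, COV = {cov:.2f} %, STE = {ste:.2f} %")
# bias = -2.44 %, COV = 11.18 %, STE = 5.00 %
```

Under these assumed definitions STE is always smaller than COV for more than one realization, which is consistent with the text's observation that an STE threshold is met at lower counts than the same COV threshold.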
Preliminary investigations show that the behavior of SUVmax is similar, but SUVmax is much more sensitive to noise, and comparable levels can be reached only with a much higher number of counts in the scan. An additional study was done by splitting the lesion group into two subgroups: small lesions, with volumes smaller than 5 ml, and large lesions, with volumes greater than 5 ml. Each subgroup has 10 lesions. As demonstrated in Fig. 7, for a given number of counts in the scan, bias, COV and STE are larger for smaller volume lesions. For SUVmean, with a target of about 5 × 10⁶ counts, a noise level (COV) of 9 % is obtained for large lesions and of 12 % for small lesions. STEs of 4 % and 2 % are reached at 5 × 10⁶ counts for small lesions and large lesions, respectively. Finally, the bias at 5 × 10⁶ counts is less than 2 % for both large and small lesions. At the same count level of 5 × 10⁶ counts, lesion CNR is about 60 % of its value at the full statistics level (data in Fig. 3(d)) and SNR in the liver is about 3 (data from Fig. 1(b)).
The realizations at each reduced dose level were obtained by randomly discarding events from the list mode data stream according to the desired count level, and thus these realizations are not fully independent. However, we obtained very similar results with bootstrap resampling [24, 25], which is considered a better method for producing independent realizations. For example, with bootstrap resampling the fitted function for SNR² in the liver for the images reconstructed with fewer than 1 million true counts was y = 2.85x + 0.19, R² = 0.81, which is close to the function obtained with the current simulation strategy (y = 2.9x + 0.20, R² = 0.79). This similarity could be due to the fact that most of the image metrics in this work were based on the mean value across these realizations. Therefore, in order to remain consistent with our previous study [15], the results presented in this work are those from the simulations that randomly discard events from the list mode data stream.
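The two ways of generating reduced count realizations mentioned above can be contrasted in a short sketch. It is illustrative only: the events are placeholder integers rather than real list mode records, the function names are hypothetical, and the specific selection rule (keeping each event independently with a fixed probability) is an assumption, since the text does not describe how events were chosen for discarding.

```python
import random


def thin_events(events, keep_fraction, rng):
    """Random discarding: keep each event independently with a fixed probability.

    Every kept event comes from the original stream without repetition.
    """
    return [event for event in events if rng.random() < keep_fraction]


def bootstrap_events(events, n_out, rng):
    """Bootstrap resampling: draw n_out events with replacement."""
    return rng.choices(events, k=n_out)


rng = random.Random(42)            # fixed seed for repeatability
events = list(range(100_000))      # placeholder for list mode events

thinned = thin_events(events, 0.10, rng)        # about 10 % of the counts
resampled = bootstrap_events(events, 10_000, rng)

print(len(thinned), len(set(thinned)))          # no event appears twice
print(len(resampled), len(set(resampled)))      # some events appear more than once
```

Realizations produced by thinning the same acquisition draw from one shared, finite pool of events, which is one way to see why the text describes them as not fully independent.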
In earlier works by Budinger et al. [26], Hoffman et al. [27] and Strother et al. [28], the relationship between image counts and noise (root-mean-square) was investigated. These earlier works used a different reconstruction scheme, so our findings cannot be compared with them directly. Nevertheless, the statistical noise (COV) in the liver for each subject at the different count levels in this study was close to the root-mean-square noise calculated with the equation from the earlier studies, up to a count level of 5 million.
In this work, we chose OSEM reconstruction with 3 iterations and 21 subsets, followed by post-reconstruction Gaussian smoothing with a FWHM of 5 mm; these parameters are commonly used in clinical settings. In the future, we will investigate the optimization of the reconstruction parameters with prospective lung cancer patient data, once the impact of reducing counts on image quality is generally understood. In addition, the impact of an inaccurate attenuation map, caused by respiratory motion, on quantitative PET lung imaging will be thoroughly explored with the data at different count levels.
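As a small aid to reproducing the smoothing step, the sketch below converts the 5 mm FWHM quoted above into the standard deviation of the Gaussian and builds a normalized one-dimensional kernel. The relation sigma = FWHM / (2·sqrt(2·ln 2)) is standard; the 2 mm voxel size and the kernel half-width are hypothetical values chosen for illustration, and the helper names are not from the source.

```python
import math


def fwhm_to_sigma(fwhm):
    """Standard deviation of a Gaussian with the given full width at half maximum."""
    return fwhm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_kernel_1d(fwhm_mm, voxel_mm, half_width):
    """Normalized 1D Gaussian kernel sampled at voxel centers."""
    sigma_vox = fwhm_to_sigma(fwhm_mm) / voxel_mm
    weights = [math.exp(-0.5 * (i / sigma_vox) ** 2)
               for i in range(-half_width, half_width + 1)]
    total = sum(weights)
    return [w / total for w in weights]


sigma_mm = fwhm_to_sigma(5.0)                                  # 5 mm FWHM from the text
kernel = gaussian_kernel_1d(5.0, voxel_mm=2.0, half_width=4)   # assumed 2 mm voxels

print(f"sigma = {sigma_mm:.3f} mm")  # sigma = 2.123 mm
```

A separable three-dimensional filter would apply such a kernel along each image axis in turn, with the voxel size of that axis.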