Introduction
Radionuclide imaging is a useful means of examining patients who may have metastases of prostate, breast or lung cancers, which are common cancers globally [1, 2]. A typical screening method is bone scintigraphy, which uses Tc-99m-methylene diphosphonate (MDP) [3] or Tc-99m-hydroxymethylene diphosphonate (HMDP) [4] as the imaging agent. Because visual interpretation of a bone scintigram is neither quantitative nor reproducible, quantitative indices have been proposed. Soloway et al. [5] proposed the extent of disease (EOD), which categorises bone scan examinations into five grades based on the number of bone metastases; it is simple but not suitable for detailed diagnosis. Erdi et al. [6] proposed the bone scan index (BSI), which standardises the assessment of bone scans [7], and presented a region growing-based semiautomated method for extracting bone metastatic lesions to measure the BSI. However, the method is time-consuming and poorly reproducible because seed regions must be input manually.
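Region growing itself is a simple flood-fill-style procedure. The sketch below is a minimal illustration, not the exact inclusion criterion of the cited method; the fixed intensity threshold and 4-connectivity are simplifying assumptions:

```python
from collections import deque

def region_grow(image, seed, threshold):
    """Grow a region from `seed`, adding 4-connected pixels whose
    intensity is at least `threshold` (a simplified inclusion rule)."""
    rows, cols = len(image), len(image[0])
    region = {seed}
    queue = deque([seed])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if 0 <= nr < rows and 0 <= nc < cols \
                    and (nr, nc) not in region and image[nr][nc] >= threshold:
                region.add((nr, nc))
                queue.append((nr, nc))
    return region
```

Note that the seed must still be supplied by hand, which is precisely the reproducibility drawback mentioned above.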
Yin et al. [8] proposed a lesion extraction algorithm using a characteristic point-based fuzzy inference system. Huang et al. [9] presented a bone scintigram segmentation algorithm followed by lesion extraction using adaptive thresholding with different cut-offs in different segmented regions. An alternative approach to lesion extraction was proposed by Shiraishi et al. [10], who presented a temporal subtraction-based interval change detection algorithm. Sajn et al. [11] proposed a method that classifies a bone scan examination as 'no pathology' or 'pathology' using a support vector machine with features derived from segmented bones. Sadik et al. [12–14] presented several algorithms addressing skeleton segmentation, hot spot detection and classification of bone scan examinations. The algorithms in [12] were improved by employing an active shape model (ASM) for skeleton segmentation and an ensemble of three-layer perceptrons for hot spot detection [13]; their performance was evaluated against 35 physicians in [14].
It should be noted that the aforementioned studies [8–14] conducted hot spot detection and bone scan classification but did not assess the BSI. One possible reason is the low accuracy of automated skeleton segmentation. For example, previous studies [8, 9] output polygonal regions that only roughly approximated bone regions. Although the skeleton segmentation of [12] was improved in [13] by the use of an ASM, it was found to be sensitive to the initial position of the model and to image noise. In addition, the whole skeleton could be divided into only four parts, each of which included several different bones. This type of approximation degrades the accuracy of the measured BSI because the coefficients given in the ICRP publication [15], which are used in the measurement, differ from bone to bone.
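To see why per-bone coefficients matter, recall that the BSI is, in essence, the sum over bones of each bone's lesion-covered fraction weighted by that bone's share of total skeletal mass. The sketch below is a simplified illustration with made-up coefficients, not the exact formulation of the cited studies:

```python
def bone_scan_index(lesion_pixels, bone_pixels, mass_fraction):
    """BSI as the mass-weighted sum of per-bone lesion fractions.
    All three dicts are keyed by bone name; `mass_fraction` values
    are each bone's share of total skeletal mass."""
    bsi = 0.0
    for bone, total in bone_pixels.items():
        if total > 0:
            bsi += mass_fraction[bone] * lesion_pixels.get(bone, 0) / total
    return 100.0 * bsi  # BSI is conventionally reported as a percentage

# Hypothetical example: 10% of the spine (20% of skeletal mass) is
# involved and none of the pelvis, giving a BSI of 2.0
example = bone_scan_index({'spine': 100},
                          {'spine': 1000, 'pelvis': 800},
                          {'spine': 0.2, 'pelvis': 0.15})
```

A segmentation that merges several bones into one region forces a single averaged coefficient onto all of them, which is exactly the source of error discussed above.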
Some of the aforementioned problems have been solved using the atlas-based approach [16], in which a manually segmented atlas consisting of more than ten bones is nonlinearly registered to an input image and the labels of the deformed atlas are transferred to the image. The atlas-based approach was also employed in other studies [17–23], as well as in the commercialised computer-aided interpretation systems EXINIbone (EXINI Diagnostics AB, Lund, Sweden) and BONENAVI (FUJIFILM Toyama Chemical Co., Ltd., Tokyo, Japan). Accurate skeleton segmentation allows precise measurement of the BSI [18] and accurate classification of bone scintigrams [17, 19–21]. Ulmert et al. [18] reported a correlation of 0.80 between manual and automated BSI using EXINIbone. Horikoshi et al. [17] and Koizumi et al. [21] evaluated the performance of BONENAVI, and Petersen et al. [20] explored the performance of EXINIbone, demonstrating their effectiveness. Nakajima et al. [19] compared EXINIbone and BONENAVI using a Japanese multi-centre database. Brown et al. [22, 23] employed atlas-based anatomical segmentation and proposed a new biomarker used in a commercially available system (MedQIA, Los Angeles, USA). Atlas-based segmentation is a promising approach but suffers from the problems of initial positioning of the atlas and differences in shape, direction and size between the atlas and the skeleton of an input image. These problems might be solved by a multi-atlas-based approach [24]; however, it is time-consuming, which is not acceptable for clinical use.
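A common fusion rule in multi-atlas segmentation is per-pixel majority voting over the registered atlas label maps; a minimal sketch (the registration step, which dominates the runtime, is assumed to have been done already):

```python
import numpy as np

def majority_vote_fusion(label_maps):
    """Fuse per-pixel labels from several registered atlases by
    majority voting, a common multi-atlas label-fusion rule."""
    stacked = np.stack(label_maps)               # (n_atlases, H, W)
    n_labels = int(stacked.max()) + 1
    # Count the votes for each label at every pixel, then pick the winner.
    votes = np.stack([(stacked == k).sum(axis=0) for k in range(n_labels)])
    return votes.argmax(axis=0)                  # (H, W) fused label map
```

The fusion itself is cheap; the cost comes from running one nonlinear registration per atlas, which is why the approach is considered too slow for clinical use.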
Deep learning-based approaches have recently emerged in the field of medical image analysis [25], initiated by the great success of deep networks in an image recognition competition [26], and numerous novel technologies [27–32] have been reported. For example, U-Net-type fully convolutional networks [28, 29] are among the most successful networks for medical image segmentation and might be useful for skeleton segmentation and for extraction of hot spots of bone metastatic lesions.
This study presents a system consisting of skeleton segmentation and extraction of hot spots of bone metastatic lesions, followed by BSI measurement. We employed a deep learning-based approach to achieve high accuracy in skeleton segmentation and hot spot extraction. One reason for the low accuracy of skeleton segmentation and hot spot extraction in existing studies [6, 8, 14, 16–21] may be that anterior and posterior images were processed independently, resulting in inconsistent results. We used a butterfly-type network (BtrflyNet) [30], which fuses two U-Nets into a single network that processes anterior and posterior images simultaneously. Because a deep and complicated network might be problematic to train, we introduced deep supervision (DSV) [31] and residual learning [32], both of which are effective at preventing gradients from vanishing or exploding during the training of a deep network. We conducted experiments using 246 cases of prostate cancer and demonstrated the effectiveness of the proposed system by comparing it with conventional approaches, namely multi-atlas-based skeleton segmentation and U-Net-based hot spot extraction.
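The two key ingredients can be sketched schematically: a residual block adds an identity shortcut around each layer, and the butterfly structure encodes the anterior and posterior views separately before fusing them in a shared body. The toy numpy sketch below uses dense layers as stand-ins for convolutions; the function and weight names are illustrative only, not the published architecture:

```python
import numpy as np

def conv_like(x, w):
    """Stand-in for a convolution + ReLU layer (a linear map here)."""
    return np.maximum(w @ x, 0.0)

def residual_block(x, w):
    """Residual learning [32]: the layer fits a residual F(x) while the
    identity shortcut carries x forward, easing gradient flow."""
    return x + conv_like(x, w)

def btrfly_forward(anterior, posterior, w_enc, w_body, w_dec_a, w_dec_p):
    """Schematic butterfly fusion: encode each view separately,
    concatenate the two feature vectors in a shared 'body', then
    decode one output per view from the fused features."""
    fa = residual_block(conv_like(anterior, w_enc), w_enc)
    fp = residual_block(conv_like(posterior, w_enc), w_enc)
    body = conv_like(np.concatenate([fa, fp]), w_body)  # fused wings
    return conv_like(body, w_dec_a), conv_like(body, w_dec_p)
```

Because both decoded outputs are derived from the same fused features, the predictions for the anterior and posterior views stay mutually consistent, which is the motivation for fusing the two U-Nets.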
Conclusion
This study proposed a deep learning-based image interpretation system for automated BSI measurement from whole-body bone scintigrams, in which BtrflyNets were used to segment the skeleton and extract hot spots of bone metastatic lesions. We conducted threefold cross-validation using 246 bone scintigrams of prostate cancer cases to evaluate the performance of the system. The experimental results revealed that the best performance was achieved by combining BtrflyNet with DSV for skeleton segmentation and BtrflyNet with residual blocks for hot spot extraction, which minimised the number of misclassified pixels. The computational time for both processes was 112.0 s per case, and the automatically measured BSI showed a high correlation (0.9337) with the true BSI, both of which are deemed clinically acceptable and reliable.
An important future direction is to increase the size of the training dataset to reduce misclassification of osteoarthritis cases; the effect of dataset size on performance would also be an interesting topic. Optimising the hyper-parameters of the deep networks, e.g. the number of layers, the number of channels (feature maps) and the weights in the loss functions, is also essential to improve segmentation and extraction accuracy as well as computational cost. It would be interesting to perform a leave-one-out evaluation for further performance analysis. Developing an anatomically constrained network is also necessary to avoid anatomically incorrect results and to enhance the reliability of the system.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.