Introduction
- We introduce a fully automated computer-aided diagnosis system for US renal imaging that integrates segmentation, detection of areas of interest, and global diagnosis into a single architecture.
- The model incorporates a segmentation task as a regularizer for the main classification tasks, which improves diagnostic performance.
- The system integrates image- and region-based analyses at multiple resolutions, jointly leveraging the advantages of both perspectives at different granularities. In our experiments, this multi-perspective approach outperforms several state-of-the-art methods.
- The proposed system provides two complementary diagnoses: a binary (healthy vs. pathological) diagnosis and a multi-class diagnosis with two global categories (hyper-echoic cortex and poor corticomedullary differentiation) and four local categories (cyst, stone, hydronephrosis, and others). The local categories can be easily expanded to nine classes if the database is extended accordingly.
- We additionally release our database, which will become the first public benchmark in the field of diagnosis from US renal imaging, promoting the advancement of knowledge in the field and contributing to the improvement of diagnosis and existing therapies.
Related Work
2D Kidney Ultrasound Segmentation
2D Kidney Ultrasound Classification
Method
Patient Population and US Image Acquisition
US Image Annotation and Interpretation
Category acronym | Taxonomy acronym | Type | Description |
---|---|---|---|
H | H | Global | Healthy kidney. Two concentric parts are distinguished: the renal cortex, the darker external part, and the renal sinus, the brighter internal part. |
PCD | PCD | Global | Poor corticomedullary distinction. In this case, the renal cortex and sinus cannot be distinguished correctly. |
HC | HC | Global | Hyper-echoic cortex. The renal cortex is hyper-echoic, which causes a low contrast in the internal part of the kidney. |
C | SCY | Local | Simple cyst. Simple cysts are usually hypo-echoic (darker), uniform, and spherical areas within the kidney. |
− | CCY | Local | Complicated cyst. Complicated cysts are very similar to simple ones, but can have a less uniform texture. |
PYR | PYR | Local | Pyramid. Pyramids are kidney areas with a regular position between the renal cortex and sinus; if they are hypo-echoic, they may be a symptom of chronic kidney disease. They usually have a less spherical shape than cysts. |
HYD | HYD | Local | Hydronephrosis. Hydronephrosis is impaired drainage of urine. The retained urine causes hypo-echogenicity in the renal sinus and, in many cases, makes the urinary tract visible. |
O | LIT | Local | Lithiasis. Lithiasis appears as a hyper-echoic (brighter) area in the internal part of the kidney that shadows part of the image in the direction of ultrasound capture. |
− | ANG | Local | Angiomyolipoma. Angiomyolipoma is a benign tumor that appears as a hyper-echoic area in the US image, generally in the renal cortex. |
− | SRM | Local | Solid renal mass. It is a possibly malignant tumor that is hypo-echoic in appearance and is not easy to distinguish from cysts. |
− | CT | Local | Cortex thinning. The renal cortex is thinner in a specific part of the kidney contour. |
− | CE | Local | Cortex eschar. The renal cortex has scars in some areas; it is not uniform. |
General Overview of URI-CADS
Segmentation, Classification, and Detection CNN (SCD-CNN)
Fusing Image- and Region-Based Predictions: Diagnosis Generation Module
- Max-aggregation: for each pathology \(k\), take the maximum probability among the detected regions assigned to category \(k\):
  $$\begin{aligned} \mathbf {p^r}_k=\max _{n | id_n=k}\left( s_n\right) , \ \ k \in [1,L]. \end{aligned}\tag{2}$$
- Mean-aggregation: for each pathology, take the mean of the probabilities of the detected regions:
  $$\begin{aligned} \mathbf {p^r}_k=\frac{1}{N}\sum _{n | id_n=k}{\left( s_n\right) }, \ \ k \in [1,L]. \end{aligned}\tag{3}$$
- LME-aggregation (Log-Mean-Exp): an intermediate version between max- and mean-aggregation:
  $$\begin{aligned} \mathbf {p^r}_k=\log \left( \frac{1}{N}\sum _{n | id_n=k}{e^{s_n}}\right) , \ \ k \in [1,L]. \end{aligned}\tag{4}$$
- Area-aggregation: takes into account both the area and the probability of each detected region, with the kidney area as a reference:
  $$\begin{aligned} \mathbf {p^r}_k=\frac{1}{\sum _{xy}{\textbf{K}}}\sum _{n | id_n=k} s_n h_n w_n, \ \ k \in [1,L], \end{aligned}\tag{5}$$
  with \(h_n=y^{max}_n-y^{min}_n\), \(w_n=x^{max}_n-x^{min}_n\), and \(\sum _{xy}{\textbf{K}}\) the kidney area, computed as the number of non-zero pixels in the binary mask.
- A score for a healthy kidney (first position in the vector). If a clinical case has a low score for every local pathology, its probability of being healthy must be high, and vice versa. Thus, the local probability that a clinical case is healthy, \(p^r_0\), is computed as
  $$\begin{aligned} p^r_0=1-\frac{1}{L} \sum _{k=1}^L {p^r_k} \end{aligned}\tag{6}$$
- The probabilities for global pathologies at the end of the vector, which are all set to zero, \(p_k^r=0, \ k \in [L+1,P]\), as they are not considered in the local branch of our system.
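The aggregation rules of Eqs. (2)-(5) and the construction of the local prediction vector can be sketched in a few lines of plain Python. This is an illustrative sketch rather than the authors' implementation: the function and argument names are hypothetical, \(N\) is assumed to be the number of detected regions of category \(k\) (which makes LME lie between the mean and the maximum), and a category without any detected region is assumed to receive a score of zero.

```python
import math


def aggregate_regions(scores, labels, num_local, method="max",
                      boxes=None, kidney_area=None):
    """Aggregate detection scores s_n into one score per local category k = 1..L.

    scores      -- detection scores s_n
    labels      -- category ids id_n, integers in 1..num_local
    boxes       -- (x_min, y_min, x_max, y_max) per region, used by "area" only
    kidney_area -- non-zero pixel count of the kidney mask, used by "area" only
    """
    aggregated = []
    for k in range(1, num_local + 1):
        idx = [n for n, label in enumerate(labels) if label == k]
        if not idx:
            # Assumption: a category with no detected region scores 0.
            aggregated.append(0.0)
            continue
        s = [scores[n] for n in idx]
        if method == "max":        # Eq. (2)
            p = max(s)
        elif method == "mean":     # Eq. (3), N = number of regions of category k
            p = sum(s) / len(s)
        elif method == "lme":      # Eq. (4), log-mean-exp
            p = math.log(sum(math.exp(v) for v in s) / len(s))
        elif method == "area":     # Eq. (5), box area relative to kidney area
            p = sum(scores[n]
                    * (boxes[n][3] - boxes[n][1])   # h_n
                    * (boxes[n][2] - boxes[n][0])   # w_n
                    for n in idx) / kidney_area
        else:
            raise ValueError(f"unknown aggregation method: {method}")
        aggregated.append(p)
    return aggregated


def local_prediction_vector(p_local, num_categories):
    """Build [p_0, p_1..p_L, 0..0]: healthy, local, and zeroed global scores."""
    p_healthy = 1.0 - sum(p_local) / len(p_local)   # Eq. (6)
    num_global = num_categories - len(p_local)      # entries L+1..P
    return [p_healthy] + list(p_local) + [0.0] * num_global
```

With \(L=4\) local and \(P=6\) total pathologies, `local_prediction_vector(aggregate_regions(scores, labels, 4), 6)` returns a vector of length seven whose first entry is the healthy score and whose last two entries are zero. Note that area-aggregation is not bounded by one when boxes overlap or extend beyond the kidney mask; the sketch leaves the value unclipped.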
- Category-level fusion: the first strategy considers a global set of category-dependent \(\alpha _k\), which remain fixed for every image in the database. This approach provides an interpretable measure of the importance of the global and local predictions for each category of the taxonomy; for example, a local category \(k\) defined by small regions will have a smaller value of \(\alpha _k\) than a local category characterized by larger regions. The fusion parameter \(\varvec{\alpha }\) is defined as a parameter of the neural network and is learned through the loss \(\mathcal {L}_{dgm}\).
- Attention-based fusion: attention mechanisms allow networks to focus on specific information in each situation. In our case, we propose to use attention to automatically set the value of \(\varvec{\alpha }\) according to the particular features of each clinical case. This strategy allows practitioners to analyze each case considering the specific \(\varvec{\alpha }\) weights estimated by the CAD system. In addition, we can still perform a category-level examination by analyzing the distributions of the \(\varvec{\alpha }\) parameter over the entire dataset. In particular, we propose a simple attention module in which \(\varvec{\alpha }\) is predicted by a linear layer working over the concatenation of global and local predictions:
  $$\begin{aligned} \varvec{\alpha } \propto W_{att} [p^i; p^r] + b_{att} \end{aligned}\tag{8}$$
  where the parameters \(W_{att}\) and \(b_{att}\) are learned using the loss \(\mathcal {L}_{dgm}\).
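The two fusion strategies can be illustrated with a short Python sketch. Two points here are assumptions and are not stated in the text above: the fused prediction is taken to be the per-category convex combination \(\alpha _k p^i_k + (1-\alpha _k) p^r_k\), and the proportionality in Eq. (8) is resolved with a sigmoid so that each \(\alpha _k\) lies in \([0,1]\). The function names are hypothetical.

```python
import math


def fuse(p_img, p_reg, alpha):
    """Per-category fusion of image-based (p_img) and region-based (p_reg) scores.

    Assumption: alpha_k weights the image-based score and (1 - alpha_k)
    weights the region-based score.
    """
    return [a * pi + (1.0 - a) * pr for a, pi, pr in zip(alpha, p_img, p_reg)]


def attention_alpha(p_img, p_reg, w_att, b_att):
    """Predict alpha from the concatenation [p_img; p_reg] with one linear layer.

    w_att is a list of rows, one per category, each of length 2 * len(p_img).
    Assumption: a sigmoid maps the linear output into [0, 1].
    """
    x = list(p_img) + list(p_reg)
    logits = [sum(w * v for w, v in zip(row, x)) + b
              for row, b in zip(w_att, b_att)]
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


# Category-level fusion: one fixed alpha per category for the whole dataset.
p_img = [0.2, 0.8]
p_reg = [0.6, 0.0]
fixed = fuse(p_img, p_reg, alpha=[0.5, 1.0])

# Attention-based fusion: alpha is recomputed for every clinical case.
w_att = [[0.0, 0.0, 0.0, 0.0], [0.0, 0.0, 0.0, 0.0]]
b_att = [0.0, 0.0]
per_case = fuse(p_img, p_reg, attention_alpha(p_img, p_reg, w_att, b_att))
```

In the category-level case `alpha` would be a learned parameter shared by all images, whereas in the attention-based case `w_att` and `b_att` are learned and `alpha` varies from case to case.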
Results
Experimental Setup
Assessment of the Aggregation Method
Type of aggregation | \(\textbf{AUC}_{\mathbf {SENS-SP}}\) (%) |
---|---|
Max-aggregation | \(\mathbf {80.97}\) |
Mean-aggregation | 79.31 |
LME-aggregation | 79.28 |
Area-aggregation | 80.59 |
Method | HC (G) | PCD (G) | C (L) | PYR (L) | HYD (L) | O (L) | Average (multi-pathological) | H (binary: healthy/pathological) |
---|---|---|---|---|---|---|---|---|
URI-CADS-I | 77.33 | 82.57 | 73.30 | 71.10 | 89.75 | 66.90 | 76.83 | 85.34 |
URI-CADS-R | − | − | 72.72 | 79.27 | 87.08 | 51.01 | 72.52 | 77.53 |
URI-CADS-C | 76.56 | 81.70 | 78.43 | 82.00 | 91.69 | 67.34 | 79.62 | 87.21 |
URI-CADS-Att | \(\mathbf {78.65}\) | \(\mathbf {84.15}\) | \(\mathbf {79.59}\) | \(\mathbf {86.61}\) | \(\mathbf {93.04}\) | \(\mathbf {69.32}\) | \(\mathbf {81.90}\) | \(\mathbf {87.41}\) |
Ablation Study and Analysis of the Fusion Parameters
Method | \(\mathrm {\varvec{\alpha }_{{\textbf {H}}}}\) | \(\mathrm {\varvec{\alpha }_{{\textbf {HC}}} (G)}\) | \(\mathrm {\varvec{\alpha }_{{\textbf {PCD}}} (G)}\) | \(\mathrm {\varvec{\alpha }_{{\textbf {C}}} (L)}\) | \(\mathrm {\varvec{\alpha }_{{\textbf {PYR}}} (L)}\) | \(\mathrm {\varvec{\alpha }_{{\textbf {HYD}}} (L)}\) | \(\mathrm {\varvec{\alpha }_{{\textbf {O}}} (L)}\) |
---|---|---|---|---|---|---|---|
URI-CADS-C | 1.0000 | 1.0000 | 1.0000 | 0.9904 | 0.7539 | 1.0000 | 0.8103 |
URI-CADS-Att* | 0.9833 | 0.8637 | 0.8281 | 0.9166 | 0.6563 | 0.8329 | 0.6736 |
Discussion
Comparison with the State-of-the-Art
Database (# of images) | | | | DB ours (\(\textbf{1985}\)) |
---|---|---|---|---|
Method | IoU/Dice (%) | | | |
Deeplabv3+ [50] | \(-/92.8\) | \(88.69/-\) | 81.87/89.85 | 81.80/89.34 |
CT2US [32] | \(-/95.2 \ (+2.4)\) | − | − | − |
SDFNet [28] | − | \(91.24 \ (+2.55)/-\) | − | − |
Bnet [31] | − | − | \(87.29 \ (+5.42)/93.03 \ (+3.18)\) | − |
TN-SCUI2020 [51] | − | − | − | \(79.31 \ (-1.49)/87.23 \ (-2.11)\) |
URI-CADS-Segmentation | − | − | − | \(\mathbf {84.99 \ (+3.19)}/\mathbf {91.23 \ (+1.89)}\) |
URI-CADS | − | − | − | \(81.41 \ (-0.39)/89.38 \ (+0.04)\) |
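For reference, the two segmentation measures reported in the table can be computed from a predicted and a ground-truth binary mask as in the following sketch. The function name is hypothetical, masks are assumed to be nested lists of 0/1 values, and the convention that two empty masks count as a perfect match is an assumption rather than something stated in the text.

```python
def iou_and_dice(pred_mask, true_mask):
    """Return (IoU, Dice) in percent for two binary masks given as nested lists."""
    inter = pred_area = true_area = 0
    for pred_row, true_row in zip(pred_mask, true_mask):
        for p, t in zip(pred_row, true_row):
            p, t = bool(p), bool(t)
            inter += p and t      # pixel is foreground in both masks
            pred_area += p
            true_area += t
    union = pred_area + true_area - inter
    if union == 0:
        # Assumption: two empty masks count as a perfect match.
        return 100.0, 100.0
    iou = 100.0 * inter / union
    dice = 100.0 * 2 * inter / (pred_area + true_area)
    return iou, dice
```

For a single image the two measures are related by Dice = 2 IoU / (1 + IoU) when both are expressed as fractions; this identity does not generally hold for values averaged over a dataset.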
2D Kidney Ultrasound Segmentation
2D Ultrasound Kidney Classification
Method | Measurement | HC | PCD | C | PYR | HYD | O | Average (multi-pathological) | H (binary: healthy/pathological) |
---|---|---|---|---|---|---|---|---|---|
SUDHARSON-ORIG [16] | \(\textrm{AUC}_{\mathrm {SENS-SP}}\) (%) | 47.70 | 50.80 | 49.76 | 48.39 | 50.02 | 49.37 | 49.34 | 47.14 |
SUDHARSON-ORIG [16] | \(\textrm{SP}_{\textrm{SENS}-95}\) (%) | 6.04 | 8.20 | 5.49 | 4.39 | 4.55 | 3.70 | 5.40 | 6.71 |
SUDHARSON-IMP | \(\textrm{AUC}_{\mathrm {SENS-SP}}\) (%) | 59.29 | 63.95 | 52.10 | 53.99 | 67.79 | 58.18 | 59.22 | 64.37 |
SUDHARSON-IMP | \(\textrm{SP}_{\textrm{SENS}-95}\) (%) | 10.43 | 12.67 | 6.90 | 11.57 | 9.17 | 11.56 | 10.38 | 29.19 |
TN-SCUI2020 [51] | \(\textrm{AUC}_{\mathrm {SENS-SP}}\) (%) | 70.30 | 75.34 | 73.89 | 74.91 | 85.85 | 64.54 | 74.14 | 77.39 |
TN-SCUI2020 [51] | \(\textrm{SP}_{\textrm{SENS}-95}\) (%) | 20.86 | 29.68 | 15.88 | 12.50 | 38.13 | 11.97 | 21.50 | 27.11 |
URI-CADS | \(\textrm{AUC}_{\mathrm {SENS-SP}}\) (%) | \(\mathbf {78.65}\) | \(\mathbf {84.15}\) | \(\mathbf {79.59}\) | \(\mathbf {86.61}\) | \(\mathbf {93.04}\) | \(\mathbf {69.32}\) | \(\mathbf {81.90}\) | \(\mathbf {87.41}\) |
URI-CADS | \(\textrm{SP}_{\textrm{SENS}-95}\) (%) | \(\mathbf {29.28}\) | \(\mathbf {43.63}\) | \(\mathbf {28.19}\) | \(\mathbf {48.19}\) | \(\mathbf {63.19}\) | \(\mathbf {14.05}\) | \(\mathbf {37.76}\) | \(\mathbf {60.59}\) |
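The two measurements in the table can be illustrated with a small Python sketch for a single category. The reading of the measures is an assumption: \(\textrm{AUC}_{\mathrm {SENS-SP}}\) is taken as the area under the sensitivity-specificity curve (numerically equal to the ROC AUC), and \(\textrm{SP}_{\textrm{SENS}-95}\) as the highest specificity reachable while sensitivity stays at or above 95%. The function names are hypothetical, and both classes are assumed to be present in the labels.

```python
def specificity_at_sensitivity(scores, labels, min_sensitivity=0.95):
    """Highest specificity among thresholds whose sensitivity >= min_sensitivity.

    scores -- predicted probabilities for one category
    labels -- 1 for positive cases, 0 for negative cases
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    best = 0.0
    for thr in sorted(set(scores)):
        # A case is predicted positive when its score is >= thr.
        sensitivity = sum(s >= thr for s in pos) / len(pos)
        specificity = sum(s < thr for s in neg) / len(neg)
        if sensitivity >= min_sensitivity:
            best = max(best, specificity)
    return best


def auc_sens_sp(scores, labels):
    """Rank-based AUC: probability that a positive outscores a negative."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))
```

Both functions return fractions in \([0,1]\); multiplying by 100 gives percentages comparable in form to those in the table.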