Introduction
Hepatocellular carcinoma (HCC) is the most common primary liver cancer worldwide and the second leading cause of cancer-related death [1, 2]. More than 80% of HCCs develop as a consequence of liver cirrhosis [3]. Thus, most patients with HCC have two distinct diseases: HCC and liver cirrhosis. It is therefore essential to assess both tumor burden and remnant liver function when making optimal treatment decisions [3, 4].
In addition to compromising liver protein synthesis, cirrhosis also leads to progressive changes in the splanchnic circulation [5]. During cirrhosis, continuous tissue reorganization leads to an increase in portal pressure. Portal hypertension ultimately leads to the development of gastroesophageal varices, ascites, and an increase in splenic volume [6, 7]. Consequently, a high splenic volume is related to severe liver cirrhosis [8]. Accordingly, splenic volume has been identified as a highly sensitive prognostic parameter for patients with HCC undergoing resection or tumor ablation [9–11]. In an initial study of patients with HCC who underwent transarterial chemoembolization (TACE), splenic volume was recently identified as a relevant prognostic factor [12]. However, that study did not investigate progression-free survival or hepatic decompensation. Moreover, the sample size was small and splenic volume was assessed manually [12]. Manual splenic volume assessment on cross-sectional computed tomography (CT) images is a time-consuming task with a high risk of interrater variance [13]. Thus, it is not feasible in daily clinical routine.
Fortunately, recent developments in the field of artificial intelligence, particularly deep learning, have enabled automated organ segmentation and volume assessment. These automated algorithms can be readily integrated into clinical workflows in real time [14]. Hence, splenic volume might become an easily assessable and readily available prognostic factor for treatment planning and post-TACE follow-up.
This study had two primary research goals: First, we aimed to build a deep-learning algorithm for fully automated splenic volume assessments based on CT images. Second, we aimed to validate the role of total splenic volume as a novel imaging biomarker for survival prediction and to investigate its role as an indicator for hepatic decompensation in patients with HCC undergoing TACE.
Discussion
This study was the first to assess the prognostic role of splenic volume for patients with HCC undergoing TACE in a Western patient cohort. Here, we developed a fully automated approach, based on deep-learning methods, for assessing splenic volume.
Manual assessments of splenic volume are time-consuming and carry a high risk of interrater variance [13]. Thus, we built a deep learning–based tool to assess splenic volume automatically from CT images. The U-Net architecture used for segmentation yielded a Sørensen–Dice coefficient of 0.96 for training and 0.96 for validation. These coefficients indicated excellent algorithm performance. Additionally, the manual and automatic splenic volume assessments differed by only 0.1%.
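The two agreement measures mentioned above can be illustrated with a minimal sketch. It assumes binary segmentation masks stored as NumPy arrays; the function names and the toy masks are hypothetical and are not taken from the study's implementation.

```python
import numpy as np


def dice_coefficient(pred_mask, true_mask):
    """Sørensen-Dice coefficient of two binary masks: 2|A ∩ B| / (|A| + |B|)."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    denom = pred.sum() + true.sum()
    if denom == 0:
        # Both masks empty: treated as perfect agreement (a convention, assumed here).
        return 1.0
    return 2.0 * np.logical_and(pred, true).sum() / denom


def relative_volume_difference(auto_ml, manual_ml):
    """Absolute difference between automatic and manual volume, in percent of manual."""
    return abs(auto_ml - manual_ml) / manual_ml * 100.0


# Hypothetical toy masks: an 8-voxel "manual" label and a 7-voxel "automatic" label.
manual = np.zeros((4, 4, 4), dtype=bool)
manual[1:3, 1:3, 1:3] = True
auto = manual.copy()
auto[1, 1, 1] = False

dice = dice_coefficient(auto, manual)  # 2 * 7 / (7 + 8) = 14/15
```

A Dice coefficient of 1 means perfect voxel-wise overlap, so the reported value of 0.96 indicates that the automatic and manual masks overlap almost completely.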
The time requirements and technical challenges of manual splenic volume assessments [13] have hindered their integration into clinical workflows, despite reports that splenic volume is a highly predictive factor in several cancer entities, including HCC [9–11]. Historically, several surrogates have been proposed for rapid estimation of splenic volume. For patients with liver cirrhosis, the axial and craniocaudal diameters of the spleen have been identified as precise surrogates of splenic volume [13]. Our results also indicated that these diameters were moderately to highly correlated with splenic volume. However, neither the craniocaudal nor the axial diameter was a relevant prognostic factor in our cohort: neither reached significance, even with optimal stratification. Thus, when deciding whether to use estimates of spleen size or true splenic volume for assessing risk in patients with HCC undergoing TACE, true splenic volume should be favored.
AI-based algorithms can potentially simplify the radiologist’s work in daily clinical routine [14]. Tasks that can be readily simplified and automated include organ segmentation, volume assessments, and body composition assessments [29–31]. Recently, deep learning algorithms have also been used to assess splenic volume in the context of variceal detection [32]. AI-based algorithms have the advantage of being easy to integrate into clinical workflows, and automated quantitative reports can be sent directly to the local picture archiving and communication system. However, these new technologies still require evaluation in the context of clinical applications. Accordingly, evaluation in specific use cases is mandatory.
Installing and training a segmentation tool requires an initial effort, including the one-time manual labelling of a training dataset, but this effort is reduced by publicly available software programs and libraries. Segmentation software and U-Net implementations other than those used in this study would likely have produced similarly effective results, and once the network is trained, no further manual segmentation is needed to obtain an accurate splenic volume.
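Once a segmentation mask is available, the volume follows from the voxel count and the voxel size. The following is a minimal sketch under stated assumptions: the mask is a binary NumPy array, the voxel spacing is given in millimetres, and the function name, spacing values, and toy mask are hypothetical rather than taken from the study.

```python
import numpy as np


def mask_volume_ml(mask, spacing_mm):
    """Volume of a binary mask in millilitres.

    spacing_mm is the voxel size along each array axis in mm
    (for example slice thickness, row spacing, column spacing).
    """
    voxel_mm3 = float(np.prod(spacing_mm))
    n_voxels = int(np.count_nonzero(mask))
    return n_voxels * voxel_mm3 / 1000.0  # 1 ml = 1000 mm^3


# Hypothetical example: a box-shaped "organ" of 6 x 50 x 50 = 15000 voxels.
mask = np.zeros((10, 100, 100), dtype=bool)
mask[2:8, 20:70, 30:80] = True

# Assumed spacing: 5 mm slices, 0.8 mm x 0.8 mm in-plane, i.e. 3.2 mm^3 per voxel.
volume_ml = mask_volume_ml(mask, spacing_mm=(5.0, 0.8, 0.8))  # about 48 ml
```

In practice the spacing would be read from the image header instead of being hard-coded.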
The literature is scarce regarding the prognostic role of splenic volume for patients with HCC undergoing TACE. To date, only one recent study, by Dai et al., has shown that splenic volume was correlated with the Child-Pugh classification and OS [12]. The mean splenic volume of the 67 patients in that study was 300 ml prior to TACE. That value was considerably lower than the mean volume of 551 ml in our patient cohort. In contrast to our study, the underlying etiology in all their patients was hepatitis B virus (HBV) infection. Unfortunately, they did not report the number of patients with underlying cirrhosis. In general, most patients with chronic HBV infections and HCC do not have underlying cirrhosis. Thus, those patients are at lower risk of developing signs of portal hypertension, such as an increased splenic volume. Accordingly, in that study, a smaller proportion of patients were in the high Child-Pugh class, compared to our cohort. Thus, those patients had better average liver function than the patients included in our study. All these factors might explain the higher splenic volume in our patient cohort. Nevertheless, the two studies reported similar optimal cut-off values (373 ml vs 383 ml in our study) for high and low splenic volume.
Splenic volume was also significantly associated with progression-free survival, hepatic decompensation, and the likelihood of receiving subsequent systemic treatment after TACE failure in our cohort. This is in line with prior findings that progression-free survival in TACE patients is linked to portal hypertension [33]. Moreover, prior studies have linked repeated TACE to an increase in portal hypertension [34] and have described the ALBI score as a predictor for failure of sorafenib treatment [35, 36].
In our study, we found that splenic volume prior to TACE had a high sensitivity for identifying patients with a post-treatment ALBI increase. Therefore, our study is the first to identify splenic volume as a relevant prognostic imaging marker for hepatic decompensation in patients with unresectable HCC. In the context of emerging novel treatment options for patients with unresectable HCC, the optimal time-point for a treatment switch in the concept of stage migration is hard to identify [37]. However, a treatment switch is of utmost importance for patient outcome, as “an inappropriately high number of TACE sessions delays the switch to systemic therapy and may, in some cases, completely hinder the treatment switch due to the deterioration of liver function” [38]. Thus, splenic volume might function as an additional, currently underused parameter to identify patients at high risk for hepatic decompensation and might therefore lead to a tighter follow-up scheme and more frequent interdisciplinary discussion of these patients. However, no standard reference values are currently available for either impaired survival or increased risk of hepatic decompensation. Thus, future large-scale multicenter evaluation studies are needed to determine a generalizable cut-off value.
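To make the notion of sensitivity at a volume cut-off concrete, the following minimal sketch computes the fraction of patients with an event whose splenic volume exceeds a threshold. The patient values are invented for illustration only. The 383 ml threshold is the cut-off reported above for this cohort, while the strict "greater than" rule and the function name are assumptions.

```python
def sensitivity_at_cutoff(volumes_ml, had_event, cutoff_ml):
    """Sensitivity = TP / (TP + FN), calling a patient 'high risk' if volume > cutoff."""
    pairs = list(zip(volumes_ml, had_event))
    true_pos = sum(1 for v, e in pairs if e and v > cutoff_ml)
    false_neg = sum(1 for v, e in pairs if e and v <= cutoff_ml)
    if true_pos + false_neg == 0:
        raise ValueError("sensitivity is undefined without any patient with the event")
    return true_pos / (true_pos + false_neg)


# Invented toy data (not study data): splenic volume in ml and whether
# the event of interest (e.g. a post-treatment ALBI increase) occurred.
volumes = [250.0, 410.0, 620.0, 390.0, 300.0, 700.0]
events = [True, True, True, True, False, False]

sens = sensitivity_at_cutoff(volumes, events, cutoff_ml=383.0)  # 3 of 4 events flagged
```

A different cut-off changes which patients are flagged, which is one reason a generalizable threshold would need multicenter validation.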
The present study had several limitations. First, it was a single-center, retrospective study. However, the sample size was distinctly larger than that of the previous study on this topic [12]. Additionally, our dataset was well investigated and we only included patients with complete clinical, laboratory, and imaging data. Furthermore, missing values were not imputed. To avoid a time bias, we deliberately included only patients from 2010 onward. These criteria minimized differences in diagnosis and treatment decisions, which provided a more homogeneous study cohort. Furthermore, we excluded patients who had undergone previous treatments to avoid other biases. Second, we included patients who underwent either conventional or drug-eluting bead TACE. However, several previous studies have shown that the TACE delivery technique did not influence OS [39–41]. Third, we only used an internal validation set to assess algorithm performance. In the final prediction for the whole dataset, the neural network failed to provide an accurate prediction of splenic volume in four patients (1.8%). We restricted the training and validation cohort to 100 patients, determined a priori, to limit the burden of manual segmentation. Nevertheless, the neural network facilitated correct splenic volume calculations in 98.2% of non-segmented spleens. Therefore, the evaluation of this use case was not substantially hindered by the need to perform additional manual segmentations of the four spleens with highly atypical anatomies. Consequently, we encourage future studies to employ neural networks for segmentation in validating the prognostic role of splenic volume for patients with HCC undergoing TACE.
In summary, we showed that it was feasible to train a deep learning algorithm for fully automated splenic volume assessment in patients with HCC undergoing TACE. Compared to established two-dimensional estimates of splenic volume, our algorithm provided precise splenic volume assessments, which were superior in predicting survival and showed high sensitivity in identifying patients at risk of hepatic decompensation. Thus, true splenic volume could serve as an additional imaging biomarker, available fully automatically and without additional effort for every CT study.