Introduction
Evaluation of pathological images is considered the gold standard for cancer diagnosis and prognosis [1, 2]. Many pathological characteristics are useful in predicting the prognosis of colorectal carcinoma (CRC). Important histological cell features on pathology images include tumor characteristics, lymphocytes, stroma, and mucinous status [3–6]. Features of tumor tissue, including the histological differentiation grade, endophytic tumor configuration pattern, and tumor budding, were correlated with tumor recurrence in patients with stage II–III CRC [3]. Stromal tissues with PD-L1-expressing immune cells have been reported to be associated with a favorable prognosis. In terms of histological segment features, the tumor-stroma ratio and tumor-lymphocyte infiltration have also been associated with prognosis [7, 8]. Although histology whole-slide images (WSIs) contain many prognostic factors, pathologists cannot readily quantify the characteristics of histology images or annotate the tissue regions related to patient outcomes. Many computational methods have been proposed to predict survival using pathological images [9]. Detecting and classifying cells on histopathological images would allow clinicians to predict patient outcomes, make precise decisions about therapies, and provide health care. However, obtaining clinically significant and explainable features from gigapixel WSIs with diverse tissue appearances remains challenging for model training. Therefore, selecting image patches and segmenting tissues from WSIs are crucial steps in developing a survival prediction method.
Deep learning has been widely applied in pathological imaging tasks [10, 11]. Survival prediction methods can be divided into region-of-interest (ROI)-based and WSI-based methods. ROI-based methods typically sample patches from the tumor area labeled by pathologists and use neural networks to extract features from the patches for survival prediction [12–15]. Zhu et al. proposed a deep convolutional survival model (DeepConvSurv) to predict survival from pathological images [12]. Pathologists annotated image regions within each tumor as the ROIs, and patches sampled from the ROIs served as the input for the DeepConvSurv model. However, the annotation process is laborious and time-consuming, which limits clinical application. In addition, the model can only obtain tumor features and cannot quantify the characteristics of other tissues, such as lymphocytes and stroma, because of the limitations of the labeled region. Thus, WSI-based methods have attempted to capture various tissue features from WSIs.
WSI-based methods usually first sample patches from WSIs and select survival-related patches [16–20]. The models then extract features from the selected patches using neural networks and aggregate the features for survival prediction. For instance, Zhu et al. proposed the Whole Slide Histopathological Images Survival Analysis framework (WSISA) to predict survival using WSIs directly [16]. WSISA adaptively sampled patches from WSIs and used K-means to cluster the patches [21]. Each cluster was used to train the DeepConvSurv model [12]. Clusters with better predictive power than random guessing [concordance index (C-index) > 0.5] were selected for aggregation and prediction. Because gigapixel WSIs are too large to fit in graphics processing unit (GPU) memory, WSI-based methods use patches instead of whole WSIs to train deep learning models. However, extracting features from patches ignores the location and quantity of tissues and cannot capture clinically significant histopathological characteristics of WSIs. Recently, Li et al. used a graph convolutional neural network (GCN) to integrate spatial information from WSIs for survival prediction [22]. However, the spatial information of a few patches cannot represent the location and quantity of tissues.
To address these problems, we propose a survival prediction method based on histopathological and tissue area features extracted from WSIs. The histopathological features were extracted from patches of actual tissue types (tumor, lymphocytes, stroma, and mucus) using the DeepConvSurv model, and the tissue area features were extracted from the tissue maps of WSIs by localizing and quantifying the tissue region (tumor, lymphocytes, and stroma) using image-processing techniques.
Discussion
Our results highlight the following important points: (i) a total of 128 histopathological features from four histological types and five tissue area features were extracted from WSIs to predict colorectal cancer survival; (ii) across six distinct survival models, our method performed better than WSISA, which adaptively samples patches from WSIs and clusters them using K-means; and (iii) using a novel deep learning-based algorithm combining tissue areas with histopathological features, we demonstrated a clinically relevant survival prediction model.
We extracted histopathological features from selected tissue sets, whereas WSISA extracted features from K-means clusters with better predictive power than random guessing (C-index > 0.5). We observed that, because of this selection strategy, the selected K-means clusters might contain patches of little prognostic value, which could have adversely affected the survival predictions. The results showed that selecting a specific tissue set based on expert advice and model performance can better extract prognostically significant patches and predict patient survival. However, extracting histopathological information from patches can only obtain histological cell features. To capture the histological segment features from the WSIs, we extracted tissue area features from the tissue map. The results showed that tissue area features could enhance the prediction performance of the histopathological features.
From the literature, we selected six popular survival models [29–34], including three statistical methods and three machine learning methods. The six survival models handle features differently, which allows a better assessment of the prognostic power of the extracted features. Statistical methods can address linear relationships between features, while machine learning methods can capture nonlinear relationships. The localization and quantification of WSI features provide a more objective method to evaluate slides. We located and quantified tissues using tissue area features that are prognostic and explainable. The survival curve of max_tumor_area (Fig. 4C1) showed that larger tumors led to poorer survival, consistent with the known tumor volume biomarker. We observed that not all lymphocytes had the same effects on survival. More lymphocytes inside tumors led to poorer survival (Fig. 4C2), whereas more lymphocytes around tumors led to better survival (Fig. 4C3). By calculating the ratio of these two features, we identified an influential prognostic factor, around_inside_ratio (p value < 0.0001) (Fig. 4C4). The survival curve of total_stroma_area (Fig. 4C5) showed that more stroma led to poorer survival. These factors could assist pathologists in making diagnoses. In our study, we used deep learning and image processing techniques to capture the areas of different tissues and considered them significant features. For example, tumor size is an important biomarker that is clinically relevant. We showed that these area features are prognostic and explainable.
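To make the idea of an area feature concrete, the following is a minimal sketch rather than the pipeline used in this study. It assumes that a tissue map has already been reduced to a small binary grid of tumor patches, and it treats the largest 4-connected tumor region as a stand-in for a feature such as max_tumor_area. That definition, the 3x3 structuring element, and the function names `closing` and `largest_component_area` are illustrative assumptions. The grid is first smoothed with a morphological closing (dilation followed by erosion) so that regions split by a one-patch gap are merged.

```python
from collections import deque


def _filter3x3(mask, combine):
    """Apply a 3x3 max (dilation) or min (erosion) filter.

    Cells outside the grid are treated as background (0).
    """
    h, w = len(mask), len(mask[0])

    def at(r, c):
        return mask[r][c] if 0 <= r < h and 0 <= c < w else 0

    return [
        [combine(at(r + dr, c + dc) for dr in (-1, 0, 1) for dc in (-1, 0, 1))
         for c in range(w)]
        for r in range(h)
    ]


def closing(mask):
    """Morphological closing: dilation followed by erosion.

    The grid is padded by one background cell first so that objects
    touching the border are not eroded away, then cropped back.
    """
    w = len(mask[0])
    padded = [[0] * (w + 2)] + [[0] + list(row) + [0] for row in mask] + [[0] * (w + 2)]
    dilated = _filter3x3(padded, max)
    eroded = _filter3x3(dilated, min)
    return [row[1:-1] for row in eroded[1:-1]]


def largest_component_area(mask):
    """Size (in cells) of the largest 4-connected foreground region."""
    h, w = len(mask), len(mask[0])
    seen = [[False] * w for _ in range(h)]
    best = 0
    for r in range(h):
        for c in range(w):
            if mask[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, size = deque([(r, c)]), 0
                while queue:
                    y, x = queue.popleft()
                    size += 1
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and mask[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                best = max(best, size)
    return best


# Hypothetical tumor mask: two 2x2 blobs separated by a one-cell gap.
tumor = [
    [0, 0, 0, 0, 0, 0, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 1, 1, 0, 1, 1, 0],
    [0, 0, 0, 0, 0, 0, 0],
]
area_before = largest_component_area(tumor)          # blobs counted separately
area_after = largest_component_area(closing(tumor))  # gap filled, blobs merged
```

In this toy grid the closing fills the one-column gap, so the largest region grows from 4 to 10 cells. On real tissue maps the structuring element size would need to be chosen with care, since a large element can also merge regions that are genuinely separate.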
In this study, we used eight tissue classes: adipose (ADI), debris (DEB), lymphocytes (LYM), mucus (MUC), smooth muscle (MUS), normal colon mucosa (NORM), cancer-associated stroma (STR), and colorectal adenocarcinoma epithelium (TUM). To determine which combination of tissues was most correlated with survival, six survival models were trained (LASSO-Cox, RIDGE-Cox, EN-Cox, SSVM, RSF, and GBRT). The combination of TUM, LYM, STR, and MUC was the most effective (Table 1). Conversely, adipose tissue, debris, muscle, and normal mucosa were less correlated with survival (Additional file 1: Table S1). Many pathological characteristics can be used to predict the prognosis of CRC, including tumor characteristics, lymphocytes, stroma, and mucin content [3–6]. Consistent with these clinicopathological findings, our four selected tissue types were also significant. We extracted 128 histopathological features from four histological types.
In recent studies, tumor-lymphocyte infiltration and the tumor-stroma ratio were also related to prognosis [7, 8]. In addition, we found that max_tumor_area, lymphocyte_inside_tumor, lymphocyte_around_tumor, around_inside_ratio, and total_stroma_area were related to cancer survival. For example, lymphocyte_inside_tumor, lymphocyte_around_tumor, around_inside_ratio, and total_stroma_area were associated with the tumor microenvironment. Some studies have shown that fat invasion of colorectal tumors is a prognostic factor [38, 39]. However, adipose tissue was less correlated with survival in this study (Additional file 1: Table S1). We did not investigate fat invasion of colorectal tumors. In the future, we may focus on cancer-associated adipocytes or peritumoral fat invasion using computational pathology. This study quantified and extracted five tissue area features from WSIs. Finally, we added the five tissue area features to the 128 histopathological features from four histological types to predict cancer survival. The study aimed to develop a computational pathology approach to extract tissue area features. We used public datasets that were not annotated by pathologists; therefore, our results were not compared with pathologists' annotations.
DeepConvSurv was used to extract histopathological features in this study. Applying newer models, such as Transformers [40], CLAM, or Streaming, might overcome the limitations of a patch-based method and improve the prediction performance. For example, transformers [40] have been shown to improve the results of many tasks with the help of the attention mechanism, although they require a large dataset or pretrained weights for training. However, we needed to use the same model (DeepConvSurv) as the previous WSISA study to demonstrate the validity and importance of tissue area features. To obtain clinically significant and explainable features of tissue areas, we compared the performance among WSISA, histopathological features only, and histopathological plus tissue area features (Table 3). Because the same model was used, the performance improvement over WSISA can be attributed to the tissue area features rather than to the model itself.
The ResNet50 tissue classifier had an overall accuracy of 93%. For the tissue area, we used the closing operation, which consists of dilation followed by erosion, to reinforce the classification results by connecting objects that were inappropriately divided into many small pieces, which might have improved the segmentation performance. Pathologists' annotations of tumors, lymphocytes, stroma, and mucus are time-consuming and labor-intensive, so the study aimed to quantify and extract prognostic features from WSIs without pathologists' labels. This study used the concordance index (C-index) as the evaluation metric. The C-index measures the proportion of concordant pairs among patient pairs by comparing two patients' survival times and predicted risks. A pair is concordant if the patient with the higher risk has a shorter survival time. Since the C-index evaluates performance from an overall patient perspective, we were not able to select an individual case in which the method did not predict well [41].
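The pairwise definition of the C-index given above can be sketched in a few lines. This is an illustrative implementation of Harrell's C-index, not the evaluation code used in this study. The handling of censored patients (a pair is comparable only when the patient with the shorter time had an observed event) and the half credit for tied risks are common conventions that are assumed here, and the function name `concordance_index` is hypothetical.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    times  : follow-up or survival time per patient
    events : 1 if the event (death) was observed, 0 if censored
    risks  : predicted risk score per patient (higher = worse prognosis)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier member of a pair
        for j in range(n):
            # Pair (i, j) is comparable when i's event occurs strictly before j's time.
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0   # higher risk, shorter survival: concordant
                elif risks[i] == risks[j]:
                    concordant += 0.5   # tied risk scores count as half (assumed convention)
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy cohort of four patients; the third is censored.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(times, events, risks)
```

In this toy cohort there are five comparable pairs, four of which are concordant, giving a C-index of 0.8. A model that assigns every patient the same risk scores 0.5, which is why 0.5 serves as the random-guessing baseline. Because the score is an aggregate over pairs, it does not by itself identify which individual patients were predicted poorly.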
For digital pathology, stain normalization is important, especially in patches [42, 43]. The training datasets in our study were normalized. However, gigapixel WSIs are too large to normalize. Normalization of the resection is also important. Normalization by the ratio of tumor patches is another way to obtain meaningful insights, but we did not use the ratio of tumor patches. In this study, we used WSIs at the same 20X magnification. In clinical practice, actual tumor sizes were correlated with survival [3]. By comparing the tumor patches at the same magnification, we can calculate the actual size of the tumor.
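The conversion from a tumor patch count to a physical area is simple arithmetic once the magnification is fixed. The sketch below is illustrative only. The patch size (224 pixels), the resolution (0.5 µm per pixel, a value often quoted for 20X scans), and the assumption of non-overlapping square patches are not taken from this study and should be replaced with the values stored in the slide metadata. The function name `tumor_area_mm2` is hypothetical.

```python
def tumor_area_mm2(n_tumor_patches, patch_size_px=224, microns_per_pixel=0.5):
    """Approximate tumor area in mm^2 from a count of tumor patches.

    Assumes square, non-overlapping patches, all extracted at the same
    magnification. Default values are placeholders, not study parameters.
    """
    patch_side_mm = patch_size_px * microns_per_pixel / 1000.0  # micrometres -> mm
    return n_tumor_patches * patch_side_mm ** 2


# Hypothetical slide with 1000 tumor patches.
area = tumor_area_mm2(1000)
```

With these placeholder values, each patch covers 0.112 mm × 0.112 mm, so 1000 tumor patches correspond to about 12.5 mm². Patch counts are comparable across slides only if every slide is processed at the same magnification, which is why a fixed 20X setting matters here.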
This study has several limitations. First, the size of the TCGA-COAD dataset is limited. More datasets should be used to validate the generalizability of the method. Second, the patch sampling rate was 5%, so some patches containing important information might not have been sampled. For a gigapixel WSI, tens of thousands of patches could result in excessive training time. More efficient ways to sample significant patches should be further explored. Third, some new WSI-based survival prediction methods have recently been proposed and have performed well. These methods should also be compared with the proposed method. Fourth, a patient may have serial pathology slides in clinical practice. In this study, our model was compared with the WSISA model [16]. In the WSISA study, the number of WSIs and patients differed [16]. One patient has two slides. Therefore, we applied the same overall survival label to a case with two slides. However, this might introduce some noise into the results.