Introduction
Globally, the mortality rate of primary liver cancer ranks fourth among cancers. In many countries, its 5-year survival rate is less than 20%, and there have been no significant changes in survival rate over time [
1‐
3]. Hepatocellular carcinoma (HCC) is the most common histological type accounting for the highest proportion (75–85%) of primary liver cancers [
1]. Recently, progression has been considered to be the most important reason for the poor prognosis of patients with HCC, and identifying the possible factors affecting the progression of HCC and exploring potential interventional therapies will improve the prognosis of HCC patients [
4].
HCC is an extraordinarily heterogeneous malignant disease considering the tumours that have thus far been identified. During personalized treatment of tumour patients, intra- and intertumour heterogeneity present a great challenge since they may directly change the predicted biological markers related to diagnosis, prognosis and therapy. Even for tumours with identical histological features, variation in the expression of biomarkers among different patients and between different tumour areas of the same individual sample, such as tumour and peritumour areas, should be considered seriously.
In HCC, the existence of microsatellite nodules is a well-known risk factor. However, such nodules usually cannot be detected on imaging modalities [
5]. These satellite nodules undetectable by imaging may be risk factors for local recurrence since they are usually ignored by local ablation therapy. Therefore, analysing the risk factors associated with satellite nodules is very important to improve the treatment strategies for patients at high risk. Characterization of the tumour heterogeneity of HCC with satellite nodules using transcriptomic analysis may be important to reduce the incidence of local recurrence.
In this study, clusters defined by ST-related specific marker genes (differentially expressed genes; DEGs) were analysed from two perspectives. The feasibility and value of using ST to dissect inter- and intratumoural heterogeneity across HCC patient specimens with or without satellite nodules were demonstrated. In addition, a prognostic signature with DEGs were constructed in The Cancer Genome Atlas (TCGA) cohort and the stability and reliability of the model were validated in the International Cancer Genome Consortium (ICGC) cohort. Assessing the pathogenesis of tumours by ST might provide a better view of the landscape of HCC heterogeneity than traditional methods to facilitate personalized management, and ST cluster-related gene signatures can provide novel antitumour drug targets for HCC.
Materials and methods
Patient samples
Tissue samples were obtained from 3 patients who underwent hepatectomy because of HCC in August 2020 from the Tumour Tissue Bank of Tianjin Cancer Hospital. Pathologists confirmed all patients’ diagnoses. Detailed pathologic and clinical data are listed in Additional file
10: Table S1. The use of these tissue samples in this study was approved by the institutional research committee.
Slide preparation
There were six capture areas (3 tumour areas and 3 peritumour areas) (6.5 × 6.5 mm) of ST slides, each with 4999 capture spots of barcoded primers (10 × Genomics). The diameter of the spots was 100 μm. These spots were arranged in a rectangular shape. Each spot contained millions of oligonucleotides with the following features: a 30-nucleotide poly (dT) sequence for the capture of polyadenylated mRNA molecules; a 12-nucleotide unique molecular identifier (UMI) for the identification of duplicate molecules that arise during the library preparation and sequencing process; a 16-nucleotide spatial barcode, which was shared by all oligonucleotides within each individual gene expression spot; and a partial TruSeq Read 1 sequence for use during the library preparation and sequencing portions of the workflow.
Tissue preparation, fixation and staining
The 10 × Visium protocol was optimized for frozen tissue. Briefly, tumours were frozen in dry ice immediately after harvesting. Tumours were embedded with optimal cutting temperature (OCT) compound and cryosectioned at 10-μm thick. The sections on the capture areas were placed and incubated them at 37 °C for 1 min and then fixed them in methanol for 10 min at − 20 °C. To stain, sections were incubated in isopropanol (Millipore Sigma) for 6 min, Mayer’s haematoxylin (Dako, Agilent, Santa Clara, CA) for 7 min, bluing buffer (Dako) for 1 min, and eosin (Sigma-Aldrich) diluted 1:5 in Tris-base (0.45 M Tris, 0.5 M acetic acid, pH 6.0) for 1 min. The slides were washed with deionized water after each of the staining steps. After air-drying, the slides were mounted with 85% glycerol and then coverslipped them. Haematoxylin and eosin (H&E)-stained samples were photographed at 40 × magnification using a digital slice scanner (Hamamatsu). The coverslip was removed after imaging by immersing slides in RNase- and DNase-free water.
Tissue permeabilization, reverse transcription and spatial library preparation
To prepermeabilize the samples, sections were incubated at 37 °C for 24 min with permeabilization enzyme. The wells were washed with saline sodium citrate (SSC) (0.1 ×) (Sigma-Aldrich). SSC was removed, and reverse transcription Master Mix was added to each well. Reverse transcription was conducted according to the ST recommended protocol. After RT, sections were incubated in KOH (0.08 M) for 5 min at room temperature and then incubated in Second Strand Mix for 15 min at 65 °C. After the removal of Second Strand Mix, 100 μl Buffer EB was added, and the sections were incubated in KOH for 10 min at room temperature. The samples were transferred from every single well to a corresponding tube containing Tris–HCl (1 M, pH 7.0). Next, 1 μl of sample was added to the qPCR plate well containing KAPA SYBR FAST qPCR Master Mix (KAPA Biosystems). A qPCR system was used to determine the optimal number of cycles. After that, 65 μl cDNA Amplification Mix was added to the remaining sample. They were incubated according to the recommended protocol.
Library preparation and RNA sequencing
After the cDNA amplification products were qualified, the sequencing library was constructed with Library Construction Kit (10 × Genomics). First, the cDNA was chemically digested. The cDNA was cut into 200 ~ 300-bp fragments, and the cDNA fragments were segmented and subjected to terminal repair and adaptor ligation. The cDNA fragments were screened. The P7 adaptor was connected and introduced into the sample index by PCR amplification. Finally, the sequence library was obtained. Sequencing was performed on an Illumina HiSeq 3000/4000 with a 150 bp paired-end run by Quick Biology (Pasadena, CA). A data quality check was done on Illumina SAV files. Demultiplexing was performed with the Illumina fastq2 v 2.17 program.
RNA sequencing analysis
In this study, 10 × Genomics official software Space Ranger was used for data preprocessing, gene expression quantification and point identification. Sequencing data preprocessing included filtering the sequenced sequences, evaluating the quality of sequencing data, and calculating the sequence length distribution. The web-based ST spot detector software Space Ranger was used to identify the spatial barcode markers in Reads1 and UMI markers of different transcripts. Read2 was aligned to the genome using the transcriptome-specific alignment software STAR, and sequences with unique alignment positions were selected for subsequent analysis. The gene spot matrix was generated by using Visium spatial barcodes, and then point clustering and gene expression analysis were performed. Seurat software was used to analyse and cluster the samples. Low-quality data points were filtered out. Principal component analysis (PCA), including the t-distributed stochastic neighbour embedding (t-SNE) and uniform manifold approximation and projection (UMAP) algorithms, was used to reduce the dimensionality of the data and visualize the data.
Data quality control and normalization with Space Ranger
Sequencing data Read1 contained barcode and UMI markers that distinguished transcripts. The 10 × Genomics official analysis software Space Ranger was used to statistically analyse the UMI-related quality control information of each sample. To understand the biological significance of expression differences in different locations, it is necessary to classify the points according to the expression level. The points with similar expression levels may come from the same type of sample. Space Ranger software can be used to preliminarily classify different locations on the genome. First, the expression of all points was normalized to compare the expression. Then, t-SNE dimension reduction analysis based on PCA was carried out, and the t-SNE results were clustered.
Identification of tumour cluster-specific marker genes (DEGs)
Gene sets with FDR-adjusted P-values below 0.05 were considered significantly enriched in the related clusters, and these genes were identified as DEGs. Kyoto Encyclopedia of Genes and Genomes (KEGG) and gene ontology (GO) analyses were used to analyse the signalling pathways involved. Marker genes were identified based on the comprehensive analysis of the database and gene rank of the log fold change (FC) value of DEGs in the clusters. Pearson correlation analysis was used to reveal the relationship between cluster-specific genes and marker genes.
Clinical significance of tumour cluster-specific marker genes in HCC
The prognostic value of tumour cluster-specific marker genes was evaluated in the KM Plotter database. The genes were submitted to the website. Approximately 155 HCC patients were split into two groups according to the median value of marker gene expression from the gene chip. These survival of the two groups of patients was compared with a Kaplan–Meier (KM) survival plot.
Data collection (TCGA-LIHC cohort and ICGC (LIRI-JP) cohort)
RNA sequencing data and the corresponding clinical information of 376 patients with liver cancer up to July 13, 2021 were downloaded from the TCGA website (
https://portal.gdc.cancer.gov/repository). The RNA sequencing data and clinical information of another 260 patients were obtained from the ICGC website (
https://dcc.icgc.org/releases/current/Projects/LIRI-JP). The data from the TCGA and ICGC databases are public. Thus, the present study was exempt from the requirement for the approval of local ethics committees. The current research follows TCGA and ICGC data access policies and publication guidelines.
Construction and validation of a prognostic spatial cluster-related gene signature
Univariate Cox analysis of overall survival (OS) was utilized to screen tumour cluster-specific marker genes with prognostic value. P values were adjusted by Benjamini & Hochberg (BH) correction. Least absolute shrinkage and selection operator (LASSO)-penalized Cox regression analysis was performed to construct a prognostic model to minimize the risk of overfitting [
6]. The LASSO algorithm was used for variable selection and shrinkage with the “glmnet” R package. The normalized expression matrix of candidate prognostic DEGs was the independent variable in regression, and the dependent variables were the OS and risk status of patients in the TCGA cohort. The risk scores of the patients were calculated according to the normalized expression level of each gene and its corresponding regression coefficients. The formula was established as follows: score = esum (each gene’s expression × corresponding coefficient). The patients were stratified into high-risk and low-risk groups based on the median value of the risk score. Based on the expression of genes in the signature, PCA and t-SNE analysis were performed with the “Rtsne” and “ggplot2” R packages to explore the distribution of different groups. Survival analysis was implemented to analyse the OS of high- and low-risk groups using the “survminer” R package. The “survival ROC” R package was used to conduct time‐dependent receiver operating characteristic (ROC) curve analyses to evaluate the predictive power of the gene signature. The infiltration scores of 16 immune cells and the activities of 13 immune-related pathways between the high- and low-risk groups were calculated by single-sample gene set enrichment analysis (ssGSEA) with the “GSVA” R package.
Immunohistochemical staining and scoring (Additional file
1: Materials and methods).
Statistical analysis
The chi-squared test was used to compare the different proportions. Student’s t test was used to compare the image scores. The ssGSEA scores of immune cells or immune pathways between the high- and low-risk groups were compared by the Mann–Whitney test, and the P value was adjusted by the Bejamini-Hochberg (BH) method. KM analysis was employed to compare the differences in OS among different groups. Univariate and multivariate Cox analyses were performed to screen the independent predictors for OS. The correlation of the prognostic model risk score or prognostic gene expression level with the stromal score, immune score and drug sensitivity was tested by Spearman or Pearson correlation analysis. R software (Version 4.0.5) with the packages Venn, igraph, ggplot2, pheatmap, ggpubr, corrplot and survminer was used to create plots. For all statistical results, a two-tailed P value less than 0.05 indicated statistical significance.
Discussion
HCC treatment outcomes are still not promising because of the high recurrence rate even after complete surgical excision. The presence of satellite lesions missed by imaging because of their small size may be one of the key factors leading to high recurrence. The purpose of the present study was to perform a full analysis of the spatial transcriptome of HCC with satellite nodules. Tissue-wide gene expression heterogeneity was investigated using ST technology, which quantifies transcriptome arrays of whole tissue sections.
Recently, a study including multiple histological types of cancer suggested that adjacent tumour tissue may be an intermediate state between normal and tumour tissue [
11]. However, no evaluation of tumour and peritumour tissue with spatial resolution has been conducted until now. Therefore, we identified differences between the tumour and peritumour specimens. The results showed that the gene expression and pathway enrichment were different in tumour and peritumour clusters. For example, known HCC-related genes (
NDRG1 [
12,
13] and
VEGFA [
14]) are among the most highly expressed genes in the “tumour” clusters. These highly expressed gene profiles may become gene markers of HCC, suggesting poor prognosis. The present study also offers a new perspective into gene expression differences between tumours with and without satellite nodules, prompting key questions with important implications for the metastasis of HCC. The results revealed two tumour clusters in the case with satellite nodules. We identified marker genes corresponding to the different clusters of tumour cells. The gene expression profiles obtained from the ST analysis may be used to predict additional regions with satellite nodules.
PPIB, UGT2B15 and
IFI27, which were found in tumour cluster A, were mainly associated with cell survival and apoptosis.
PPIB (cyclophilin B, CypB) is a member of the PPIase family. It has been reported to play an important role in protein folding. Recent studies have shown that HCC cell survival can be stimulated by
PPIB through a positive feedback loop with hypoxia-inducible factor-1a (HIF-1a) [
15].
PPIB is associated with malignant progression, and gene regulation has been noted by some researchers [
16]. Overexpression of miR-206 promotes apoptosis and inhibits the metastasis of HCC cells by targeting
PPIB [
17].
UGT2B15 is a functional member of the
UGT2B subfamily. The expression of
UGT2B15 is mainly observed in liver, prostate and breast cancer. It has been found to contribute to glucuronidation of androgenic steroids [
18‐
20]. The role of UGT2B15 in inducing tumour progression and drug resistance has been reported in some studies [
20]. Bioinformatic analysis suggests that
UGT2B15 activates the Hippo‑YAP signalling pathway, leading to the pathogenesis of gastric cancer [
20]. Interferon alpha-inducible protein 27 (
IFI27) consists of 122 amino acids. It belongs to a hydrophobic mitochondrial protein family [
21].
IFI27 maintains a low level of expression in multiple mammalian cells and is involved in a wide range of biological processes, such as apoptosis and innate immunity [
22,
23]. The development of tumours can be affected by
IFI27 downregulation in many cancers. TRAIL-induced apoptosis in animal and cellular models of HCC and gastric cancer can be induced when
IFI27 is downregulated. This result suggests that
IFI27 may play a critical important role in tumour development [
24]. IFI27 downregulation results in a decrease in the formation of the cyclin A/CDK1 complex, inducing epithelial cell arrest in S phase, and cell proliferation is therefore inhibited [
25]. Hong W reported that
IFI27 upregulation promotes cell proliferation and invasion and reduces apoptosis [
26].
NDRG1, BHLHE40 and
VEGFA, which were found in tumour cluster B, mainly affect tumour proliferation, metastasis and invasion. N-Myc downstream-regulated gene 1 (
NDRG1) is a crucial cytosolic ubiquitously expressed protein.
NDRG1 is an important molecule in controlling HCC metastasis and is thus suggested as a novel biomarker for predicting HCC recurrence after liver transplantation [
13].
NDRG1 is a significant marker for metastasis, recurrence and poor prognosis in HCC [
13]. Yan et al. [
27] found that
NDRG1 expression is generally upregulated in HCC tissues compared with normal samples, particularly in recurrent and metastatic HCC.
BHLHE40 (also known as
DEC1/BHLHB2/SHARP2/STRA13) belongs to the basic helix-loop-helix (bHLH) protein family, which is a large superfamily of transcriptional regulators expressed in many organisms. High expression of
BHLHE40 is significantly correlated with the activation of a hypoxia-response pathway, elevated metastatic potential, and a poor prognosis in many tumours, such as HCC, pancreatic cancer, and invasive breast cancer [
28‐
30].
BHLHE40 is activated under hypoxic conditions by HIF-1α in HCC, stimulating tumour progression [
31]. Vascular endothelial growth factor (VEGF) is an essential angiogenic growth factor in physiological and pathological states. High expression of
VEGFA has been detected in a large number of solid tumours, including HCC [
32,
33]. It is involved in the regulation of the metastasis of many solid tumours and their neovasculature [
14,
34]. In HCC, VEGF is an extremely important angiogenic factor.
VEGFA secreted by tumour cells promotes an epithelial-to-mesenchymal transition phenotype, consequently inducing tumour invasion [
35]. In our study, spatial gene expression was measured in HCC tissues for the first time.
Spatially mapping gene expression in one case uncovered a new landscape [
36], and another two cases were used to validate the results. The results showed that these marker genes in tumour clusters were also highly expressed in the tumour area. We also detected the clinical significance of marker genes through the database and found that these genes can be used to predict prognosis and survival in patients with HCC. In the future, relevant panels can be detected, and molecules in different regions can help improve the accuracy of clinical predictions, of which cases of HCC may be at a higher risk of metastasis. High expression of the markers in both clusters may suggest that there are satellite nodules in tumours.
To further explore the role of spatial cluster-specific marker genes (DEGs), we screened the DEGs in the TCGA and ICGC cohorts. The expression of tumour cluster marker genes was significantly different between tumour and adjacent nontumour tissues, and many of them were associated with OS according to univariate Cox regression analysis. These results suggested that we could construct a prognostic model using cluster-specific marker genes. Patients were divided into high- and low-risk groups according to the median risk score. The results showed that high risk was significantly correlated with higher tumour grade, advanced tumour-node-metastasis (TNM) stage and shorter OS. Risk score was an independent predictor for OS according to independent prognostic analysis.
We constructed a prognostic model with 6 cluster-specific marker genes (
ADH1A, ADH1B, CYP3A4, FCGBP, PABPC1, NDRG1) in the present study. Some researchers have reported that a variation in alcohol dehydrogenase 1A (
ADH1A) may contribute to slow alcohol metabolism, which induces increased blood acetaldehyde levels in Korean subjects [
37]. In HCC patients, a high expression of
ADH1A was associated with good survival and a less aggressive disease state [
38]. Alcohol dehydrogenase 1B (
ADH1B) is mainly known for its involvement in the major human ethanol metabolic pathway [
39]. As a multifunctional enzyme, human hepatic cytochrome P-450 3A4 (
CYP3A4) has a wide range of substrates, including commonly used drugs [
40]. The expression of the Fc fragment of IgG-binding protein (
FCGBP) is low in some tumours and high in others [
41]. Low expression of
FCGBP could be used as a crucial regulator of tumour growth factor 1 (
TGF-1)-induced epithelial-mesenchymal transition in gallbladder cancer [
42]. In contrast, high expression of
FCGBP significantly decreases the OS of colorectal cancer patients [
43]. High expression of
PABPC1 is associated with worse OS for HCC and may contribute to the progression of HCC [
44]. As discussed above,
NDRG1 is a biomarker for metastasis, recurrence and a poor prognosis in HCC.
The results showed higher fractions of macrophages in the high-risk groups of the TCGA and ICGC cohorts. It has been demonstrated that increased infiltration of tumour-associated macrophages is associated with a poor prognosis in HCC patients due to their role in immune invasion [
45,
46]. In addition, a high risk score was associated with impaired antitumour immunity, as indicated by the activity of the type II IFN response and type I IFN response as well as the fractions of NK cells. Therefore, it is reasonable to assume that the antitumour immunity of the high-risk group is attenuated, which may be an important reason for their poor prognosis. According to the ESTIMATE algorithm, the expression of prognostic genes was also significantly correlated with the stromal score and immune score (P < 0.05), indicating that the tumour tissue in the high-risk group was highly infiltrated by immune cells. The analysis of the expression of prognostic genes and drug sensitivity showed that all prognostic genes were correlated with the sensitivity to some chemotherapy drugs. These data demonstrated that some prognostic genes can be used as therapeutic targets to overcome drug resistance or adjuvant drug sensitivity.
In conclusion, the results of this study have demonstrated that the analysis of tumour gene expression combined with ST remarkably increases granularity when compared to bulk analysis. Tumours with negligible histological differences and various regions of the same tumour showed significant differences in the transcription profiles of tumour cells at each site. High expression of PPIB, UGT2B15, IFI27, NDRG1, BHLHE40 and VEGFA may suggest the formation of satellite nodules that cannot be detected by imaging.
The study also defined a new prognostic signature consisting of six cluster-specific marker genes. We proved that the gene profile is independently associated with OS in the TCGA cohort and ICGC validation cohort and confirmed its value in the analysis of function, tumour microenvironment and drug sensitivity, providing insight for predicting the prognosis of HCC. The specific potential mechanism linking cluster-specific marker genes and tumour immunity in HCC remains unclear and is worthy of further study.
In summary, we propose that expression profiles based on spatial analysis can serve as new markers for the prediction of HCC prognosis.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.