Introduction
Hepatocellular carcinoma (HCC) is the fourth most fatal malignancy. It is a complex, multistep disease involving genetic and epigenetic alterations, and its etiology and molecular mechanisms remain largely unknown. Although progress has been made in treatment, the prognosis of HCC is still unsatisfactory because of its extreme heterogeneity. Vascular invasion is associated with worse outcomes in HCC [1]. Both microscopic and macroscopic vascular invasion are associated with tumor recurrence and short survival times [2]. The increased rate of HCC recurrence is partially caused by microvascular invasion (MVI) [3].
Growing evidence suggests that long noncoding RNAs (lncRNAs) play a critical role in the development and progression of HCC. Numerous HCC-associated lncRNAs are abnormally expressed and contribute to malignant characteristics [4]. LncRNAs, whose transcripts contain more than 200 nt, can regulate gene expression. Progress in transcriptome sequencing over the past ten years has shown that more than 70% of the genome is transcribed, and that the vast majority of the genome encodes lncRNAs [5]. LncRNAs play a significant role in numerous biological regulatory systems and, as a result, are significantly linked to the tumorigenesis, progression, and spread of malignancies [6]. In addition, numerous studies have found that lncRNAs can alter the intrinsic properties of tumor cells to remodel the tumor microenvironment [7].
Increasing evidence has revealed that signatures related to vascular invasion show promising predictive value for the diagnosis, prognosis and treatment response evaluation of malignant tumors, and lncRNAs contribute greatly to the development of these signatures. However, the majority of signatures are constructed from the absolute expression values of individual RNAs or proteins, whereas the accuracy and sensitivity of cancer diagnosis models can be improved by utilizing gene pairs [8].
In the current work, we adopted a two-lncRNA combination strategy that does not require the absolute expression levels of lncRNAs to construct a lncRNA pair signature that correlates with vascular invasion. A signature based on 5 pairs of vascular invasion-related lncRNAs was constructed by using a novel modeling algorithm. Moreover, the risk score generated based on the signature was assessed for its correlation with diverse features, such as survival status, clinicopathological characteristics and chemotherapeutic efficacy.
Materials and methods
Data collection (TCGA-LIHC cohort) and differential expression analysis
Clinical and RNA sequencing data for 365 HCC cases, available prior to 13 October 2021, were obtained from the TCGA website (https://portal.gdc.cancer.gov/repository). TCGA data are publicly accessible, so the current research did not require approval from a local ethics committee. The present study complies with TCGA publication and data access rules. Ensembl (http://asia.ensembl.org) GTF files were obtained for annotation in order to discriminate between mRNAs and lncRNAs for further study. A gene set associated with vascular invasion was obtained from the GSEA dataset (M41805) and used to select vascular invasion-related lncRNAs through co-expression (correlation) analysis. LncRNAs were considered to be correlated with vascular invasion when the correlation coefficient was larger than 0.4 and the P value was less than 0.001. We used the R package limma to perform differential expression analysis within the vascular invasion-related lncRNAs to determine the differentially expressed lncRNAs (DElncRNAs). The cutoffs were defined as a false discovery rate (FDR) < 0.05 and log fold change (FC) > 2.
Construction of DElncRNA pairs
We established a 0-or-1 matrix by cyclically pairing the DElncRNAs one by one as follows: for a pair consisting of lncRNA A and lncRNA B, the value X is defined as 1 if lncRNA B has a lower expression level than lncRNA A, and as 0 otherwise. Afterward, the 0-or-1 matrix was subjected to secondary screening. A pair was regarded as a satisfactory match unless the expression quantities of 0 or 1 of the lncRNA pair accounted for greater than 20% of all matches.
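The pairing step can be sketched in a few lines of Python. This is an illustrative sketch, not the authors' implementation: the function names and toy values are hypothetical, and the screening rule is an assumed reading of the 20% criterion (a pair is kept only when the fraction of samples scored 1 lies strictly between 20% and 80%, so that neither value dominates).

```python
from itertools import combinations


def build_pair_matrix(expr):
    """Build the 0-or-1 matrix from per-sample expression values.

    expr maps each lncRNA name to a list of expression values, one per
    sample. For a pair "A|B", a sample scores 1 if B is expressed at a
    lower level than A, and 0 otherwise.
    """
    pairs = {}
    for a, b in combinations(sorted(expr), 2):
        pairs[f"{a}|{b}"] = [
            1 if vb < va else 0 for va, vb in zip(expr[a], expr[b])
        ]
    return pairs


def screen_pairs(pairs, low=0.2, high=0.8):
    """Secondary screening (assumed rule, see the paragraph above)."""
    kept = {}
    for name, values in pairs.items():
        frac_ones = sum(values) / len(values)
        if low < frac_ones < high:
            kept[name] = values
    return kept


# Hypothetical toy data: three lncRNAs measured in five samples.
expr = {
    "LNC_A": [5, 1, 4, 2, 6],
    "LNC_B": [3, 2, 1, 3, 7],
    "LNC_C": [0, 0, 0, 0, 0],
}
pair_matrix = build_pair_matrix(expr)
kept_pairs = screen_pairs(pair_matrix)
print(kept_pairs)
```

Because each value depends only on the ordering of two lncRNAs within the same sample, the matrix is unaffected by any transformation of a sample's expression values that preserves their order.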
Constructing a predictive model
Vascular invasion-related DElncRNAs with prognostic significance were identified using univariate Cox analysis of overall survival (OS). Least absolute shrinkage and selection operator (LASSO)-penalized Cox regression analysis was then used to construct the predictive model and reduce the possibility of overfitting. The "glmnet" R package was used for variable selection and shrinkage with the LASSO strategy.
The normalized expression levels of all genes and their matching regression coefficients were used to generate the risk scores for the patients. The following formula was developed: $\text{score} = e^{\sum (\text{each pair's expression} \times \text{corresponding coefficient})}$. Based on the optimal ROC cut-off value, the patients were classified into high- and low-risk subgroups.
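As a worked illustration of the risk score, the sketch below reads the formula as the exponential of the coefficient-weighted sum of pair values and then splits patients at a cut-off. The pair names, coefficients, patients and cut-off value are all hypothetical; in the study the coefficients come from the LASSO Cox model and the cut-off from the ROC analysis.

```python
import math


def risk_score(pair_values, coefficients):
    """score = exp(sum over pairs of pair value x coefficient)."""
    linear_predictor = sum(
        coefficients[name] * pair_values[name] for name in coefficients
    )
    return math.exp(linear_predictor)


def stratify(scores, cutoff):
    """Label a patient 'high' if the score exceeds the cut-off, else 'low'."""
    return {
        patient: ("high" if score > cutoff else "low")
        for patient, score in scores.items()
    }


# Hypothetical coefficients for two lncRNA pairs (not taken from the paper).
coefficients = {"PAIR_1": 0.5, "PAIR_2": -0.3}

# Hypothetical 0-or-1 pair values for three patients.
patients = {
    "pt1": {"PAIR_1": 1, "PAIR_2": 0},
    "pt2": {"PAIR_1": 0, "PAIR_2": 1},
    "pt3": {"PAIR_1": 0, "PAIR_2": 0},
}

scores = {pid: risk_score(vals, coefficients) for pid, vals in patients.items()}
groups = stratify(scores, cutoff=1.0)  # cut-off chosen only for illustration
print(scores)
print(groups)
```

Since the exponential is monotonic, ranking patients by the score is equivalent to ranking them by the linear predictor; the exponential only changes the scale on which the cut-off is expressed.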
Validation of the predictive model
We used the "survminer" R package and survival analysis to compare the overall survival (OS) of patients in the high- and low-risk subgroups. Time-dependent receiver operating characteristic (ROC) curve analyses were performed using the "survivalROC" R package to evaluate the predictive ability of the signature. We conducted univariate and multivariate Cox regression analyses to determine whether the model is an independent prognostic factor. The R packages survival, pheatmap, and ggpubr were used in the process.
Evaluation of the significance of the model for antitumor drugs
The half-maximal inhibitory concentration (IC50) values of commonly administered chemotherapeutic drugs in the TCGA-LIHC dataset were assessed to evaluate the model's clinical applicability for treating patients with HCC. According to AJCC recommendations, sorafenib and other antitumor drugs can be used to treat liver malignancy. We used the Wilcoxon signed-rank test to assess the difference in IC50 between the high- and low-risk subgroups. The results are presented as box plots generated with the R packages pRRophetic and ggplot2.
Immune components of C1-C6
Statistical analysis
To compare proportions, chi-squared analysis was employed. Kaplan-Meier (KM) analysis was used to examine the differences in OS between the subgroups. Independent variables for OS were screened using univariate and multivariate Cox analyses. Spearman or Pearson correlation analysis was performed to determine whether the predictive risk score or prognostic gene expression level was associated with drug sensitivity. Plots were generated in R software (version 4.0.5) with the packages Venn, igraph, ggplot2, pheatmap, ggpubr, corrplot, and survminer. For all findings, a two-tailed P value of less than 0.05 was considered statistically significant.
Discussion
In recent years, an increasing number of studies have aimed to construct signatures to predict the prognosis of patients with malignancies. Most of these signatures require the absolute expression levels of transcripts to be measured. In the present study, a sound predictive model was developed using two-lncRNA combinations, so absolute gene expression values were not needed for the signature. With this two-lncRNA combination model, only the relative expression level of the two lncRNAs in each pair needs to be considered, and there is no need for batch correction of differences between different kinds of data.
Although the relationship between vascular invasion and human cancer has been studied by some researchers, there are few reports on its correlation with immune components. The association between the risk score and immunological components was also investigated to better understand the role of the risk score in immune infiltration. The results showed that a high risk score was highly correlated with enrichment of the C2 cluster, but a low risk score was closely related to enrichment of the C3 and C4 clusters, suggesting that C2 induces tumorigenesis and progression, while C3 and C4 are favorable protective elements. This conclusion was consistent with earlier research since increased cytotoxicity can limit tumor incidence and progression (the immune phenotypes are numbered from 1 to 6 from lowest to highest relative abundance of cytotoxic cells) [9].
Recent studies have improved our understanding of immune checkpoint expression in HCC and have indicated that immune checkpoint blockade could be a rational therapeutic approach for HCC [13, 14]. High risk scores were shown to be correlated with high levels of PD-1, PD-L1, TIM3, ENTPD1, and TIGIT. PD-L1 is frequently highly expressed in cancer cells as a defense strategy, as this phenotype facilitates escape from immune surveillance. New treatments targeting immunological checkpoints, such as anti-PD-L1 antibodies, have demonstrated therapeutic effectiveness in a variety of tumors [14]. T-cell exhaustion, characterized by a decreased capacity of T cells to release cytokines along with upregulation of immunological checkpoint receptors (for example, PD-1 and CTLA4), has been reported in several tumors, including HCC [15]. The expression levels of the immune checkpoint inhibitory molecules PD-1 and TIM3 in tumor-associated antigen-specific T cells from HCC specimens are higher than those in T cells from tumor-free liver tissues or blood. Strategies to block PD-L1 and TIM3 should be explored for the treatment of HCC.
Epithelial–mesenchymal transition (EMT) is a critical step in tumor progression and metastasis. ZEB1 and ZEB2 are structurally related E-box binding homeobox transcription factors that can promote EMT [16]. To investigate the role of the risk score in EMT, the correlation between ZEB1, ZEB2 and the risk score was examined. The levels of ZEB1 and ZEB2 expression were considerably lower in the low-risk subgroup than in the high-risk subgroup, suggesting that the risk score is a good marker for indicating EMT. In our previous study, we found that VEGFA, NDRG1 and BHLHE40 may suggest the presence of satellite nodules in HCC [17]. To better understand the correlation between the risk score and satellite nodules, the association between the risk score and VEGFA, NDRG1, and BHLHE40 was also investigated. The findings also suggested that the risk score is an effective indicator. HCC cells possess stem cell-like features, such as immortality, resistance to treatment, and transplantability [18]. CD44 has already been validated as an informative marker of stem cells in primary tumors. To gain more insight into the role of the risk score in tumor stemness, the relationship between the risk score and CD44 was analyzed. The results showed that CD44 expression was considerably higher in the high-risk subgroup than in the low-risk subgroup. The risk score was positively related to CD44 expression, suggesting that it is a good marker to detect tumor stemness.
Based on pathway analysis, tumor-related signaling pathways, such as the MAPK, NOTCH, TGF-BETA, WNT, and P53 signaling pathways, were considerably enriched in the high-risk subgroup. The involvement of these pathways has been associated with HCC, suggesting novel therapeutic targets [19-21]. The correlation analysis between the predictive model and chemotherapeutics indicated that the risk score was correlated with sensitivity to chemotherapeutics such as sorafenib, nilotinib, rapamycin, cisplatin, PD.0325901, mitomycin C and erlotinib. Sorafenib was the only systemic therapy option for patients with advanced HCC for almost a decade. Nilotinib inhibits MYC and NOTCH1 expression in HCC cell lines, inhibits the growth of xenograft tumors in mice, and inhibits the formation of liver tumors in animals harboring MET and catenin β1 transposons, lowering MYC and NOTCH1 levels in tumors [22]. Rapamycin, an mTOR inhibitor, can reduce the protumorigenic impact of VEPH1 knockdown and is an effective therapeutic option for patients with HCC [23]. Cisplatin is a conventional chemotherapeutic agent. Mitomycin C promotes bystander killing in homogeneous and heterogeneous hepatoma cellular models [24]. Erlotinib inhibits cell cycle progression and causes apoptosis of HCC cells while increasing chemosensitivity to cytostatics [25].
Conclusion
In summary, this study revealed a novel predictive signature composed of 5 vascular invasion-related lncRNA pairs. The signature was independently related to OS in patients with HCC and was verified to be effective in functional analysis. The risk score based on this signature was found to be related to the levels of important genes, immune checkpoint inhibitors and chemotherapeutic sensitivity, providing information for predicting HCC prognosis. External validation with other clinical datasets would be helpful, so we will collect new clinical specimens to increase the sample size for further validation in the future. Overall, this study provides promising insight into vascular invasion-related lncRNAs. The signature does not require the absolute expression values of lncRNAs and could be utilized for HCC diagnosis and prognosis evaluation, which suggests that it is valuable for the development of personalized cancer therapies.
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit
http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (
http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.