Background
Hepatocellular carcinoma (HCC) is the fifth most common malignancy and the third most common cause of cancer-related death worldwide [1]. Despite great improvements in early diagnosis and multidisciplinary cancer management, the long-term prognosis remains poor. Thus, an effective prognostic model that identifies patients at high risk of recurrence and metastasis could guide clinical management. Conventional models utilizing clinical tumor-node-metastasis (TNM) staging, vascular invasion, and other parameters help predict HCC prognosis [2]. However, considering the great heterogeneity of HCC, the predictive efficacy of conventional models is still far from satisfactory. It is therefore important to take molecular markers into account when establishing novel predictive tools.
With the advance of genome-sequencing technologies, accumulating evidence has shown that gene signatures at the mRNA level have great potential in predicting HCC prognosis. For example, Long et al. established a four-gene prognostic model (comprising CENPA, SPP1, MAGEB6, and HOXD9) that accurately predicted overall survival (OS) using data from The Cancer Genome Atlas Liver Hepatocellular Carcinoma dataset (TCGA-LIHC) [3]. Similarly, Zheng et al. identified another four-gene signature (comprising SPINK1, TXNRD1, LCAT, and PZP) for predicting the prognosis of HCC using data from TCGA-LIHC and the Gene Expression Omnibus (GEO) database [4]. Deep mining of publicly available genomic data appears to be an efficient way to identify novel, robust prognostic gene signatures to guide prognostic stratification and personalized therapy.
In this study, we conducted univariate and LASSO Cox regression analyses to identify novel prognostic biomarkers and established a prognostic six-gene signature using data from TCGA. Multivariate Cox regression analysis confirmed the independent prognostic role of the six-gene signature. A nomogram was established to predict HCC prognosis. Gene set enrichment analysis was performed to help explain the underlying mechanisms. In addition, the prognostic value of the six-gene signature was validated in the GSE14520 dataset from the GEO database. The prognostic signature also showed a strong ability to differentiate HCC from normal tissues. Collectively, our results suggest that the six-gene signature and nomogram might help predict overall survival of HCC patients effectively.
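As an illustration of how a Cox-derived gene signature is typically turned into a patient-level risk group, the following minimal sketch computes a risk score as a weighted sum of expression values and splits a cohort at the median score. The coefficients, the expression values, and the median cutoff are all assumptions made for this example and are not the fitted values or the cutoff of the present study; only the gene names and the direction of each association (five unfavourable genes, one favourable gene) follow the text.

```python
from statistics import median

# Hypothetical coefficients for illustration only; NOT the fitted values of the study.
# Positive sign: higher expression raises the risk score. Negative sign: lowers it.
COEFFICIENTS = {
    "CSE1L": 0.30,
    "CSTB": 0.20,
    "MTHFR": 0.25,
    "DAGLA": 0.15,
    "MMP10": 0.10,
    "GYS2": -0.20,
}


def risk_score(expression):
    """Linear predictor: sum of coefficient * expression over the signature genes."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def stratify(patients):
    """Label each patient 'high' or 'low' risk, using the cohort median as an assumed cutoff."""
    scores = {pid: risk_score(expr) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    return {pid: ("high" if score > cutoff else "low") for pid, score in scores.items()}


# Made-up expression values for three hypothetical patients.
patients = {
    "P1": {"CSE1L": 6.0, "CSTB": 5.0, "MTHFR": 4.0, "DAGLA": 3.0, "MMP10": 2.0, "GYS2": 1.0},
    "P2": {"CSE1L": 3.0, "CSTB": 3.0, "MTHFR": 3.0, "DAGLA": 2.0, "MMP10": 1.0, "GYS2": 8.0},
    "P3": {"CSE1L": 4.0, "CSTB": 4.0, "MTHFR": 3.5, "DAGLA": 2.5, "MMP10": 1.5, "GYS2": 4.0},
}
groups = stratify(patients)
print(groups)
```

In this toy cohort the patient with high expression of the unfavourable genes and low GYS2 falls into the high-risk group. In practice the coefficients would come from the fitted Cox model, and the cutoff would be whatever threshold the analysis defines.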
Discussion
HCC remains a major challenge for public health worldwide. Conventional parameters such as TNM staging, vascular invasion, and AFP help predict HCC prognosis to some degree. However, considering the great heterogeneity of HCC, identification of novel prognostic biomarkers and establishment of more accurate prognostic models are urgently needed. Moreover, combining a prognostic gene signature with conventional clinical parameters may yield better predictive efficacy than a single biomarker. Recently, gene signatures based on aberrant mRNA expression have gained much attention and shown great potential in cancer prognosis prediction [3, 16, 17, 19].
In this study, we established a novel six-gene signature (comprising CSE1L, CSTB, MTHFR, DAGLA, MMP10, and GYS2) for HCC prognosis prediction. While CSE1L, CSTB, MTHFR, DAGLA, and MMP10 were found to be negative prognostic genes, GYS2 showed the opposite association. The signature predicted prognosis well not only in the TCGA HCC cohort but also in the GSE14520 cohort, and its performance was comparable with that of six previously reported models. The six-gene risk score was an independent prognostic factor for HCC, and patients in the high-risk group showed significantly poorer survival than patients in the low-risk group. ROC and DCA demonstrated that the nomogram combining the six-gene signature with conventional clinical prognostic factors performed best in predicting short-term survival (1-year and 3-year) but not long-term survival (such as 5-year) for patients with HCC. Together, these results indicate that the risk model developed from the six genes could be a useful indicator of HCC survival. Furthermore, GSEA revealed several significantly enriched oncological signatures and various metabolic processes, which might help explain the molecular mechanisms underlying the signature. We also found that the risk score showed a strong ability to differentiate HCC from normal tissues, suggesting that the signature has considerable potential in the differential diagnosis of HCC.
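One common way to quantify how well a continuous score separates two groups, such as tumor and normal tissue, is the area under the ROC curve (AUC). The sketch below computes it directly as the probability that a randomly chosen tumor sample scores higher than a randomly chosen normal sample. The function name and the scores are invented for illustration, and whether the study measured discrimination in exactly this way is an assumption.

```python
def auc(tumor_scores, normal_scores):
    """Rank-based AUC: fraction of (tumor, normal) pairs in which the tumor
    sample has the higher score; tied pairs count as one half."""
    wins = 0.0
    for t in tumor_scores:
        for n in normal_scores:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor_scores) * len(normal_scores))


# Made-up risk scores for hypothetical tumor and normal samples.
tumor = [2.1, 3.4, 1.8, 2.9]
normal = [0.9, 1.2, 2.0]
value = auc(tumor, normal)
print(round(value, 3))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation. The pairwise loop is quadratic in sample size, which is acceptable for a sketch but would normally be replaced by a rank-based computation for large cohorts.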
CSE1L, also known as CAS (cellular apoptosis susceptibility protein), has been reported as an oncogene in several cancers [20-22]. CSE1L is a multifunctional gene that participates in apoptosis, chromosome assembly, nucleocytoplasmic transport, microvesicle formation, chemo-resistance, and cancer progression [20, 23, 24]. However, the role and mechanism of aberrant CSE1L in HCC remain poorly defined. CSTB is a reversible endogenous inhibitor of lysosomal cysteine proteinases [25]. Mutations of CSTB lead to progressive myoclonus epilepsy (EPM1), which is an inherited and lethal autosomal disease [26]. Dysregulated expression of CSTB has been proposed as a useful biomarker in various cancers such as ovarian cancer [27], esophageal cancer [28], and breast cancer [29]. Notably, CSTB was found to be overexpressed in most HCCs and was elevated in the serum of most HCC patients [30]. Diacylglycerol lipase (DAGL) hydrolyzes diacylglycerol to 2-arachidonoylglycerol (2-AG) and free fatty acid (FFA) [31]. Disruption of DAGL activity influenced the development of the central nervous system [32]. Recently, Okubo et al. reported that DAGLA promoted tumorigenesis in oral squamous cell carcinomas by regulating the cell cycle [33]. Roy et al. indicated that DAGLA participated in ovarian cancer progression caused by loss of the endosulfatase HSulf-1 [34]. Nevertheless, the role of DAGLA in HCC remains unclear. MTHFR catalyzes the conversion of 5,10-methylenetetrahydrofolate to 5-methyltetrahydrofolate, a co-substrate for homocysteine re-methylation to methionine. Methionine is the precursor of S-adenosylmethionine (SAM), and SAM is the direct methyl donor for DNA methylation [35]. Abnormal MTHFR activity leads to abnormal gene methylation, gene instability, and finally cancer [36]. Accumulating studies demonstrate that MTHFR polymorphism affects susceptibility to various cancers, especially HCC [37-41]. Matrix metalloproteinases (MMPs) are widely accepted as critical modulators of the tumor microenvironment [42]. MMP10 promoted HCC through its involvement in tumor angiogenesis, growth, and dissemination [43]. Decreased glycogen concentration negatively correlated with tumor growth [44]. GYS, the rate-limiting enzyme of glycogen synthesis, has two isoforms, GYS1 and GYS2. Loss of GYS2 causes glycogen storage disease type 0 [45]. A very recent study revealed that decreased expression of GYS2 reduced glycogen and indicated unfavorable clinical outcomes in HCC. Mechanistically, GYS2 suppressed tumor growth in HBV-related HCC via a negative feedback loop with p53 [46].
To our knowledge, the prognostic model and nomogram based on this six-gene signature have not been reported previously and could be a useful prognostic and diagnostic classification tool for HCC. The risk score is based on the mRNA expression, rather than the somatic mutation or methylation status, of only six prognostic genes. It could therefore be more practical and cost-effective in routine use, as it reduces the need for whole-genome sequencing of all patients. The nomogram combining our signature with conventional clinical parameters such as TNM stage showed significantly improved performance, especially in predicting short-term survival (1-year or 3-year), indicating that it better reflects the great heterogeneity of HCC. However, several limitations of our study should be taken into consideration. First, our study was mainly based on data from TCGA, in which most patients were White or Asian; our findings should be extended to patients of other ethnicities with great caution. Second, external validation of the six-gene signature and prognostic nomogram in more independent cohorts is necessary. Third, the expression and prognostic role of the six genes at the protein level warrant further investigation. Fourth, calibration plots showed that the nomogram (combined model) might under-estimate or over-estimate mortality, so efforts should be made to further improve its predictive performance. Fifth, all mechanistic analyses in our study were descriptive, and further functional experiments are needed to clarify the underlying mechanisms of the six genes. Sixth, apart from its excellent performance in differentiating HCC from normal liver, the ability of our signature to distinguish among normal liver, liver adenomas, focal nodular hyperplasia, and hepatocellular carcinomas remains to be elucidated.