Introduction
Hepatocellular carcinoma (HCC) is the second leading cause of cancer-related deaths globally (Siegel et al. 2020). HCC is currently treated with surgery, chemotherapy, radiation, and liver transplantation (Chaabna et al. 2021). Nonetheless, the general prognosis of liver cancer patients is poor owing to the invasion, migration, and drug resistance of cancer cells and to uncertain diagnosis (Jiang et al. 2021). The advent of immunotherapy has brought new hope to patients with advanced HCC, and the tumor microenvironment (TME) in HCC strongly influences tumor initiation and progression (Eggert and Greten 2017). Immune checkpoints are a significant group of regulatory pathways that are essential for regulating inflammatory reactions and sustaining self-tolerance (Waldman et al. 2020). Patients with advanced HCC have been shown to benefit from immune checkpoint inhibitors (ICIs) (Sheng et al. 2020). Nevertheless, the vast majority of patients derive negligible or no clinical benefit from immune checkpoint blockade, which falls far short of clinical needs (Newman et al. 2015; Angelova et al. 2015). Hence, thorough study of the pathophysiology of HCC and the identification of new therapeutic targets and prognostic biomarkers are of great clinical significance for enhancing the efficacy of immunotherapy and improving the prognosis and quality of life of patients (Wang et al. 2021a).
Alternative splicing (AS), a post-transcriptional process, produces alternative mRNA transcripts that are critical for normal development and contributes to the proteomic complexity of mammalian genomes (Chen et al. 2019). Growing evidence suggests that AS, which affects more than 90% of human genes, is a major source of protein diversity, an important molecular marker of human cancer, and a promising target for novel cancer treatments (Baralle and Giudice 2017). Studies have reported that AS not only correlates markedly with tumor occurrence and development, invasion and metastasis, and treatment resistance, but also plays a noteworthy role in the formation of the immune microenvironment (Qi et al. 2020). CPSF4 is a component of the cleavage and polyadenylation specificity factor (CPSF) complex and is essential for 3′-end maturation and polyadenylation of mRNA. Recent genome-wide histone–RNA interaction studies have shown that pre-mRNA alternative splicing and alternative polyadenylation are two closely related processes, with 3′-end forming factors playing a vital role in alternative splicing (Misra and Green 2016), and CPSF4 has also been reported to be implicated in the alternative splicing of Tp53 mRNA (Dubois et al. 2019). However, it is still unknown how CPSF4 affects the onset and progression of liver cancer or which molecular mechanism controls it. More research is required to determine whether CPSF4 can serve as a diagnostic marker or therapeutic target for liver cancer.
In this work, we investigated the biological role of CPSF4, its expression in HCC, and its association with the prognosis of the disease. Seven key AS molecules related to CPSF4 were identified; these molecules were related not only to the prognosis of hepatocellular carcinoma but also to the expression of immune checkpoint genes and immune cell infiltration. The analysis points to the involvement of CPSF4 and related AS molecules in the onset and progression of liver cancer and suggests additional prognostic and diagnostic markers and prospective therapeutic targets.
Materials and methods
Data
RNA-sequencing (RNA-seq) data and related clinical information for 371 HCC patient samples and 50 normal samples were retrieved from the TCGA website (https://portal.gdc.cancer.gov/repository). Fifty-six HCC patient samples with a survival time greater than 6 years or with missing values were excluded. To normalize the gene expression profiles, we applied the scaling approach offered by the “DESeq2” R package (Love et al. 2014). RNA-seq data and clinical details for another 240 tumor samples were obtained from the ICGC portal (https://dcc.icgc.org/projects/LIRI-JP).
Functional annotation of CPSF4 gene
CPSF4 gene expression across 33 cancer types was examined in the TIMER2 database (http://timer.cistrome.org/). Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses of the genes correlated with CPSF4 expression (|r| > 0.3, p value < 0.05) between the tumor and normal groups were performed using the “clusterProfiler” R package. The Benjamini–Hochberg (BH) method was used to adjust the p values. The R packages “survival” and “ggplot2” were used to perform the HCC survival analysis. Immunohistochemistry (IHC) images of CPSF4 protein expression in healthy tissues and HCC tissues were obtained from the Human Protein Atlas (HPA) (http://www.proteinatlas.org/) to compare CPSF4 expression at the protein level.
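To make the Benjamini–Hochberg adjustment mentioned above concrete, the following is a minimal Python sketch of the standard step-up procedure. It is not the authors' code (the analysis was run in R); the function name `benjamini_hochberg` and the toy p values are assumptions made purely for illustration.

```python
def benjamini_hochberg(p_values):
    """Return BH-adjusted p values in the same order as the input."""
    m = len(p_values)
    # Indices of the p values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that the
    # adjusted values stay monotone (each is capped by the one above it).
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        candidate = p_values[idx] * m / rank
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Toy p values (hypothetical, not taken from the study).
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
```

Each raw p value is multiplied by m/rank, and the backward pass keeps the adjusted values non-decreasing in rank, which is what a BH adjustment is generally understood to do.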
Finding differentially expressed AS genes
Identification of differentially expressed genes (DEGs): The DESeq2 package was used to normalize the data for each gene expression profile into cpm values. Significant DEGs were selected using the cutoff criteria p value < 0.05 and |logFC| ≥ 2. AS data for the identified DEGs were obtained from the TCGA SpliceSeq database with the following settings: percentage of samples with percent-spliced-in (PSI) values, 100; minimum PSI range (increments between samples), 0; minimum PSI standard deviation, 0.
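The DEG cutoff described above can be sketched in a few lines of Python. This is an illustrative filter only, not the authors' pipeline: the `GeneResult` record, the `select_degs` helper, and the placeholder gene names and values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class GeneResult:
    gene: str       # gene identifier (placeholder names below)
    log_fc: float   # log fold change, tumor vs. normal
    p_value: float  # p value from the differential expression test


def select_degs(results, p_cutoff=0.05, lfc_cutoff=2.0):
    """Keep genes with p value < p_cutoff and |logFC| >= lfc_cutoff."""
    return [
        r.gene
        for r in results
        if r.p_value < p_cutoff and abs(r.log_fc) >= lfc_cutoff
    ]


# Hypothetical toy results, not data from the study.
toy = [
    GeneResult("GENE_A", 2.5, 0.001),   # passes both criteria
    GeneResult("GENE_B", -2.0, 0.04),   # passes: |logFC| equals the cutoff
    GeneResult("GENE_C", 3.1, 0.20),    # fails the p value criterion
    GeneResult("GENE_D", 1.2, 0.001),   # fails the logFC criterion
]
selected = select_degs(toy)
```

The absolute value means that both up-regulated and down-regulated genes are retained, which matches the |logFC| criterion stated in the text.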
Construction of AS gene signature
First, univariate Cox regression analysis for overall survival (OS) was used to screen for prognostic AS genes. The prognostic AS genes were then combined into a multi-gene signature using LASSO regression. The optimal value of lambda (lambda.min) was determined by tenfold cross-validation. Univariate Cox and LASSO regressions were carried out using the R packages “survival” and “glmnet” (Ramsay et al. 2018). The risk score of each patient was calculated from the signature using the formula below:
$$\mathrm{Riskscore}=\sum_{i=1}^{n}\beta_{i}\times \mathrm{Exp}_{i}$$
where $\beta_{i}$ is the LASSO coefficient of gene i, $\mathrm{Exp}_{i}$ is the expression of gene i, and n is the number of genes in the signature. Based on the optimal cutoff value determined by the “surv_cutpoint” function of the “survminer” R package, all samples were split into high-risk and low-risk groups. This function determines the optimal cutpoint for continuous variables using maximally selected rank statistics (maxstat).
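As a worked illustration of the risk score formula, the Python sketch below computes the weighted sum of expression values for one patient and assigns a risk group. The coefficients, expression values, gene labels, and the cutoff are hypothetical placeholders; in the study the cutoff was chosen by maximally selected rank statistics, which this sketch does not reimplement.

```python
def risk_score(coefficients, expression):
    """Sum of coefficient * expression over the signature genes."""
    return sum(beta * expression[gene] for gene, beta in coefficients.items())


def assign_group(score, cutoff):
    """Label a patient as high or low risk relative to a given cutoff."""
    return "high" if score > cutoff else "low"


# Hypothetical LASSO coefficients and expression values (not study data).
coefs = {"gene_1": 0.5, "gene_2": -0.25, "gene_3": 1.0}
patient = {"gene_1": 2.0, "gene_2": 4.0, "gene_3": 1.5}

score = risk_score(coefs, patient)       # 0.5*2.0 - 0.25*4.0 + 1.0*1.5
group = assign_group(score, cutoff=1.0)  # cutoff value is an assumption
```

A negative coefficient lowers the score as expression rises, so genes associated with better survival pull a patient toward the low-risk group.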
Evaluation and verification of the AS gene signature
Using the R package “survivalROC”, receiver operating characteristic (ROC) curves were plotted to validate the discrimination of the signature. A Kaplan–Meier (K–M) curve was produced using the R package “survival”, together with a log-rank test for OS, to evaluate the signature's predictive ability. Univariate and multivariate Cox regression were used to determine whether the risk score was an independent prognostic factor for OS, and the ICGC cohort served as the validation cohort. The survival rates of the risk groups were compared using the log-rank test and K–M survival curves. The 1-, 2-, and 3-year ROC curves were used to determine the prognostic signature's sensitivity and specificity (Heagerty et al. 2000).
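For readers unfamiliar with the Kaplan–Meier estimator used in the survival comparisons, the following is a minimal Python sketch of the product-limit calculation. It is an illustration under simplifying assumptions, not the “survival” package's implementation; the function name `kaplan_meier` and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(time, survival probability)] at each distinct event time.

    times  : follow-up time for each subject
    events : 1 if the event (death) was observed, 0 if censored
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all subjects that share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            # Product-limit step: multiply by the fraction surviving at t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Both deaths and censored subjects leave the risk set after t.
        n_at_risk -= removed
    return curve


# Hypothetical toy cohort: five subjects, two of them censored.
km = kaplan_meier(times=[1, 2, 2, 3, 4], events=[1, 1, 0, 1, 0])
```

Censored subjects do not produce a step in the curve but still shrink the risk set, which is why the drop at each later event time becomes larger.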
Clinical analysis of AS gene signature
Survival analysis was used to assess the difference in OS between the groups with high and low expression of each AS gene. Kaplan–Meier curves and two-sided log-rank tests were used in each of the aforementioned survival analyses. Immunohistochemical images of HCC from the HPA database were examined to further evaluate differences in AS gene protein expression levels. We also performed a correlation analysis of tumor stage with the risk score and the AS genes.
Tumor immunity analyses
The RNA-sequencing data were used to derive the expression levels of nine immune checkpoint genes thought to be potential targets for cancer immunotherapy. The expression levels of the 15 immune checkpoint genes were then compared between the two risk groups and AS genes using Wilcoxon analysis.
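The group comparison above can be illustrated with the rank-based statistic underlying an unpaired Wilcoxon (Mann–Whitney) test. The text does not state which Wilcoxon variant was used, so the rank-sum form for two independent groups is an assumption here, and the Python sketch below computes only the U statistic, not a p value. The function name and the toy expression values are hypothetical.

```python
def mann_whitney_u(group_a, group_b):
    """Return the U statistic for group_a; tied values share average ranks."""
    # Pool both groups, tagging each value with its group (0 = a, 1 = b).
    pooled = sorted([(v, 0) for v in group_a] + [(v, 1) for v in group_b])
    rank_sum_a = 0.0
    i = 0
    while i < len(pooled):
        j = i
        # Find the block of tied values starting at position i.
        while j < len(pooled) and pooled[j][0] == pooled[i][0]:
            j += 1
        # Ranks i+1 .. j are averaged across the tied block.
        avg_rank = (i + 1 + j) / 2.0
        rank_sum_a += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        i = j
    n_a = len(group_a)
    return rank_sum_a - n_a * (n_a + 1) / 2.0


# Hypothetical expression values of one checkpoint gene in two risk groups.
u_separated = mann_whitney_u([5.0, 6.0, 7.0], [1.0, 2.0, 3.0])
u_tied = mann_whitney_u([1.0, 2.0], [2.0, 3.0])
```

U counts the pairs in which a value from the first group exceeds a value from the second, with ties counted as one half, so it reaches the product of the two group sizes when the groups are fully separated.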
Examination of AS events
Univariate Cox analysis was used to analyze AS events in the AS genes. Correlation analysis was conducted to show the relationship between CPSF4 gene expression and the PSI values of survival-associated AS events.
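A correlation between gene expression and PSI values can be sketched as follows. The text does not specify the correlation coefficient, so Pearson's r is assumed here for illustration; the function name `pearson_r` and the toy expression and PSI values are hypothetical and do not come from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of cross-products and sums of squared deviations from the means.
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-sample values: expression of one gene and the PSI
# (a fraction between 0 and 1) of one AS event.
expression = [1.0, 2.0, 3.0, 4.0]
psi_up = [0.2, 0.4, 0.6, 0.8]    # rises linearly with expression
psi_down = [0.8, 0.6, 0.4, 0.2]  # falls linearly with expression

r_up = pearson_r(expression, psi_up)
r_down = pearson_r(expression, psi_down)
```

A positive coefficient would suggest that inclusion of the event rises with expression and a negative one that it falls; neither establishes that the gene regulates the event.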
Statistical analysis
All statistical analyses were carried out in R 4.2.1 (http://www.R-project.org) using the necessary packages. The sensitivity and specificity of the developed signature were evaluated using ROC curves. To investigate the key prognostic factors, we employed univariate and multivariate Cox regression. For categorical data, the Fisher exact test or χ² test was used to analyze the association between variables. Measurement data between groups were compared using Student's t test. Unless otherwise noted, statistical significance was defined as p < 0.05.
Discussion
Currently, the main treatments for HCC are surgical resection and liver transplantation, but the overall prognosis of HCC patients remains poor because of frequent recurrence and the limitations of available treatments (Sung et al. 2021), and it is challenging to create a reliable predictive model of overall survival for patients with HCC. Compared with models based on single components, prognostic models based on a combination of novel prognostic biomarkers can improve prognostic prediction (Nault and Villanueva 2015; Torrecilla et al. 2017). Thus, there is an urgent need to find novel clinical features closely associated with the occurrence and development of hepatocellular carcinoma to better predict recurrence, metastasis, and prognosis. Recently, as high-throughput sequencing and bioinformatics technologies have advanced, more and more studies have concentrated on the analysis of tumor AS genes (Li et al. 2019; Marzese et al. 2018; Sciarrillo et al. 2020). Because the importance of AS genes in HCC is still unclear, particularly for HCC prognosis and immunotherapy, studying their mechanism and prognostic value in HCC is crucial. Previous research has shown that CPSF4 is overexpressed in a number of cancer types and is related to prognosis (Zhang et al. 2021; Wu et al. 2019; Yi et al. 2016). However, previous studies on CPSF4 in liver cancer were limited to its polyadenylation mechanism (Wang et al. 2021b). It is necessary to further explore the regulatory role of CPSF4 in AS genes and its role in the malignant phenotype, prognosis, and tumor microenvironment of liver cancer.
Seven AS genes identified by examining TCGA-LIHC data (STMN1, CLSPN, MDK, RNFT2, PRR11, RNF157, and GHR) were employed in this investigation to predict the prognosis of HCC with greater accuracy. STMN1 is a cytosolic phosphoprotein that regulates microtubule dynamics in response to cellular needs, and plays an important role in mitotic spindle formation and cell division (Rubin and Atweh
2004; Hu et al.
2020). The methylation of STMN1 is associated with the prognosis of HCC, and the expression of STMN1 is closely related to the change of m6A (Zhang et al.
2022). CLSPN, as the gene encoding Claspin protein, plays an important role in key cellular events such as checkpoint activation after DNA damage, DNA replication and replication stress response, DNA repair and apoptosis (Azenha et al.
2017,
2019). MDK is a growth factor, which participates in various physiological processes of organisms. Its overexpression in tumor tissues promotes the growth, migration, induction of EMT and multidrug resistance of tumor cells (Filippou et al.
2020; Hu et al.
2021). RNFT2 acts as an inhibitor of inflammation by targeting the IL-3 cytokine receptor IL-3Rα for degradation, and it may play an important role in the innate immune response in lung cancer (Tong et al.
2020). GHR is a membrane-bound receptor belonging to the class I cytokine receptor superfamily, which has been implicated in the development of many types of cancer (Zhu et al.
2022).
After univariate and multivariate analysis of the clinical data from TCGA-LIHC, the seven-gene-based risk score was found to be an independent prognostic factor for HCC patients. Patients in the high-risk group had a considerably worse outcome than those in the low-risk group. Our model shows a high AUC at 1, 2, and 3 years and can accurately predict short-term survival in HCC patients, suggesting that our AS gene signature has a significant predictive advantage. In this work, a predictive model for overall survival in HCC patients was created by identifying a novel seven-gene signature through thorough data analysis. Yet our investigation has several limitations. First, it still has to be supplemented with an external validation dataset. Second, the expression and prognostic implications of the seven genes at the protein level were not examined in this study. Third, more clinical validation is necessary to confirm the reliability of the scoring model. The results of this investigation must, thus, be confirmed by subsequent clinical studies.
Conclusion
In this study, seven CPSF4-related AS genes were shown to be predictive of prognosis in patients with liver cancer. We then examined the associations of these prognostic factors with immune cell infiltration and the tumor microenvironment. In addition to offering useful prognostic indicators and novel therapeutic targets, our findings will contribute to understanding the regulatory role of CPSF4 in AS events in liver cancer and the function of AS genes in malignancies.