Background
According to the Global Cancer Statistics report of 2018, liver cancer was the sixth most commonly diagnosed cancer and the fourth leading cause of cancer death worldwide in 2018 [
1]. The highest incidence (mortality) of liver cancer is in East Asia, accounting for 35.5% of the global total. The main risk factors for liver cancer are chronic infection with hepatitis B virus (HBV) [
2‐
4], hepatitis C virus (HCV) [
5‐
7], aflatoxin-contaminated food [
8], heavy alcohol consumption [
6,
9,
10], obesity [
11], smoking [
12] and type 2 diabetes [
13,
14]. Statistics show that the risk factors for liver cancer differ among 53 countries and across regions of the world. In most high-risk areas, such as China and East Africa, chronic HBV infection and aflatoxin exposure are the main determinants of liver cancer. In contrast, HCV infection is the leading cause of liver cancer in Japan and Egypt [
15,
16]. In areas with a low risk of liver cancer, rising obesity rates are the leading cause of the increase in liver cancer cases.
The internationally recognized TNM cancer staging method divides cancers into stages I, II, III and IV [
17]. Previous work has also divided cancer into early, middle and late stages. In terms of TNM stages, stage I is early-stage, stages II and III are middle-stage, and stage IV is late-stage. Most cancers are diagnosed at a late stage, and this holds especially true for liver cancer. Medical research has shown that the liver has no pain sensation, so even when liver disease has begun, the body cannot feel or recognize it through a pain-feedback mechanism. Hence, the clinical manifestations of liver disease are very slight, and most patients with liver cancer are diagnosed at a late stage because symptoms do not appear, and are not identified, in time [
18‐
21]. The cure rate of early-stage liver cancer is favorable; therefore, if a diagnosis can be made at any stage before stage IV, treatment of the tumor will be less intensive than it would be at the final stage.
Alpha-fetoprotein (AFP) is currently the only clinically used biomarker for the early diagnosis of liver cancer. AFP was discovered more than 50 years ago and is not a very accurate diagnostic biomarker for liver cancer: 32 to 59% of liver cancer patients have been shown to have normal AFP levels [
22]. Therefore, finding new diagnostic biomarkers of liver cancer is of great significance for accurate diagnosis. For cancer patients, prognosis and survival time are of utmost importance, both for improving quality of life and for choosing the diagnosis and treatment scheme. Currently, therapeutic indications for liver cancer focus more on tumor size and the number of nodules and less on the tumor’s aggressiveness [
23]. Patients with multiple large but non-aggressive liver cancer nodules may have a better prognosis than patients with a small but aggressive nodule, which suggests that the current prognostic criteria are not optimal. If new genes related to the prognosis of liver cancer can be identified, this could benefit both treatment and patients’ quality of life. In this work, liver cancer patient data from the TCGA and GEO databases were mined to identify diagnostic biomarkers and prognostic biomarkers of liver cancer. The aim is to improve the accuracy of the early diagnosis of liver cancer, enabling early detection and treatment and thus reducing mortality. At the same time, accurate judgment of the prognosis of liver cancer patients could help streamline the choice of adjuvant treatment.
Discussion
Most patients with liver cancer do not seek medical treatment until symptoms appear in the late stage of the disease; therefore, early diagnosis of liver cancer is of great significance for treatment. At present, alpha-fetoprotein (AFP) is the diagnostic biomarker used in the clinical diagnosis of liver cancer. AFP was discovered 50 years ago as a diagnostic biomarker of liver cancer, and its diagnostic accuracy is limited. According to investigations, 32 to 59% of liver cancer patients have normal AFP levels [
22]. Therefore, it is necessary to find new and more accurate biomarkers for liver cancer diagnosis. In addition, prognosis is of great significance to the quality of life and treatment of cancer patients, so the search for prognostic biomarkers is equally important. To this end, this work uses data mining to find diagnostic biomarkers and prognostic biomarkers associated with liver cancer.
First, a liver cancer data set comprising 50 normal liver tissue samples and 371 liver cancer samples was obtained from the TCGA database. The GSE25097 data set, obtained from the GEO database, consisted of 243 non-tumor tissue samples and 268 liver cancer samples. Differentially expressed gene (DEG) analysis yielded 102 Common DEGs shared by the TCGA and GSE25097 data sets. GO and Reactome Pathway enrichment analyses were then conducted on the 102 Common DEGs. The results showed that liver cancer involves changes in collagen at the cellular level, changes in hormone metabolism and the response to metal ions at the biological-function level, and abnormalities in molecular binding and oxidoreductase activity at the molecular level (Fig.
3).
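The overlap step described above can be illustrated with a minimal sketch. It assumes each DEG analysis has already produced a list of gene symbols; the function name `common_degs` and the toy gene lists (including the placeholder symbols GENE_X and GENE_Y) are hypothetical and are not the data or code used in this work.

```python
def common_degs(degs_a, degs_b):
    """Return the genes that appear in both DEG lists, sorted by symbol."""
    return sorted(set(degs_a) & set(degs_b))


# Hypothetical toy inputs, not the real TCGA or GSE25097 results.
tcga_degs = ["SPP1", "AFP", "FOSB", "TOP2A", "GENE_X"]
geo_degs = ["FOSB", "SPP1", "TOP2A", "GENE_Y"]

common = common_degs(tcga_degs, geo_degs)
print(common)
```

A stricter variant could additionally require that a gene changes in the same direction (up or down) in both data sets; whether that rule was applied here is not stated in the text.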
A protein-protein interaction (PPI) network was constructed for the 102 Common DEGs to find correlations between genes, and 22 Hub Genes were screened from the 102 Common DEGs based on Degree value (Table
2). A receiver operating characteristic (ROC) curve reflects the relationship between sensitivity and specificity, which is of great significance for the accurate diagnosis of diseases [
26]. ROC curves were used to analyze the 22 Hub Genes with an AUC greater than 90% as the threshold, which yielded 16 Hub Genes: SPP1, AURKA, CXCL12, FOS, NUSAP1, TOP2A, UBE2C, AFP, DCN, GMNN, PTTG1, RRM2, SOCS3, FOSB, PCK1 and SPARCL1. The expression levels of these 16 Hub Genes can accurately distinguish normal liver tissue from liver cancer; therefore, the 16 genes (including AFP, which is already used in clinical practice) can serve as diagnostic biomarkers for the early diagnosis of liver cancer. At the same time, the effect of the 22 Hub Genes on the survival time of liver cancer patients was observed and the risk coefficient was calculated. The expression levels of ESR1, SPP1 and FOSB among the 22 Hub Genes had a significant impact on the survival time of liver cancer patients (
p < 0.05), with hazard ratio (HR) values of 0.88, 1.1 and 0.88, respectively, indicating that ESR1 and FOSB are low-risk genes while SPP1 is a high-risk gene. However, the AUC value of ESR1 is 68.7% (Fig.
6a), which shows that the diagnostic accuracy of ESR1 is low and that it is not suitable for use as a diagnostic biomarker. As a result, only FOSB and SPP1 are suitable for use as prognostic biomarkers of liver cancer, with FOSB a low-risk gene and SPP1 a high-risk gene. In other words, the survival rate of liver cancer patients with high expression of FOSB is higher than that of patients with low expression, whereas the survival rate of patients with high expression of SPP1 is lower than that of patients with low expression. This conclusion is supported by the literature. Tang C. et al. found that overexpression of FOSB protein inhibited tumor cell proliferation, clone formation and cell migration [
27], while the silencing of FOSB protein expression promoted tumor cell proliferation, clone formation and cell migration [
28]. Li H.’s study also confirmed that the overexpression of FOSB protein can promote the proliferation of cancer cells. These studies confirmed that FOSB is a low-risk gene. Similarly, for SPP1, Lu C. et al. found that silencing the OPN protein (encoded by the SPP1 gene) in liver cancer reduced the number of cell clones and the proliferation rate, and in vivo pharmacodynamic experiments showed that tumor volume decreased in tumor-bearing mice [
29]. These findings support SPP1 being a high-risk gene.
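The AUC screening step can be sketched as follows. This is an illustrative example only, not the software used in this work: it computes the AUC from its rank-based definition (the probability that a randomly chosen tumor sample has a higher expression value than a randomly chosen normal sample, with ties counted as one half). The helper names, the toy expression values and the choice to score down-regulated genes as max(AUC, 1 − AUC) are all assumptions.

```python
def auc(tumor_values, normal_values):
    """Rank-based AUC: P(tumor value > normal value), ties count as 0.5."""
    wins = 0.0
    for t in tumor_values:
        for n in normal_values:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor_values) * len(normal_values))


def diagnostic_auc(tumor_values, normal_values):
    """Direction-agnostic AUC (assumption): a gene that is strongly
    down-regulated in tumors discriminates as well as an up-regulated one."""
    a = auc(tumor_values, normal_values)
    return max(a, 1.0 - a)


# Hypothetical expression values: (tumor samples, normal samples) per gene.
expression = {
    "GENE_UP": ([5, 6, 7, 8], [1, 2, 3, 4]),
    "GENE_DOWN": ([1, 2, 3, 4], [5, 6, 7, 8]),
    "GENE_WEAK": ([1, 4, 5, 8], [2, 3, 6, 7]),
}

THRESHOLD = 0.90  # the 90% cut-off stated in the text
selected = [gene for gene, (tumor, normal) in expression.items()
            if diagnostic_auc(tumor, normal) > THRESHOLD]
print(selected)
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a few hundred samples per group; a rank-sum formulation would be preferable for larger cohorts.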
Finally, single-gene GSEA analysis was performed on the three prognostic genes, ESR1, SPP1 and FOSB, that affect the survival time of liver cancer patients (Fig.
8) in order to explore the mechanism affecting the prognosis of liver cancer patients. The analysis found three pathways closely related to the ESR1, FOSB and SPP1 genes (Fig.
8B a1, b1, c1), seven pathways closely related to ESR1 and SPP1 genes (Fig.
8B a2, c2), and four pathways closely related to ESR1 and FOSB genes (Fig.
8B a3, b3).
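The grouping of pathways by the genes that share them can be sketched with simple set operations. The example below is a hedged illustration: the function name `group_shared_pathways` is hypothetical, the input is a small toy subset built from pathway names mentioned in this section plus one placeholder (PATHWAY_ONLY_FOSB), and it does not reproduce the actual single-gene GSEA results.

```python
def group_shared_pathways(enriched):
    """Map each set of two or more genes to the pathways enriched for
    exactly those genes. `enriched` maps gene -> set of pathway names."""
    groups = {}
    all_pathways = set().union(*enriched.values())
    for pathway in sorted(all_pathways):
        genes = frozenset(g for g, paths in enriched.items() if pathway in paths)
        if len(genes) >= 2:  # keep only pathways shared by several genes
            groups.setdefault(genes, []).append(pathway)
    return groups


# Toy subset for illustration; not the full GSEA output.
enriched = {
    "ESR1": {"HALLMARK_E2F_TARGETS", "HALLMARK_BILE_ACID_METABOLISM",
             "HALLMARK_COAGULATION"},
    "SPP1": {"HALLMARK_E2F_TARGETS", "HALLMARK_BILE_ACID_METABOLISM"},
    "FOSB": {"HALLMARK_E2F_TARGETS", "HALLMARK_COAGULATION",
             "PATHWAY_ONLY_FOSB"},
}

groups = group_shared_pathways(enriched)
for genes, pathways in groups.items():
    print(sorted(genes), pathways)
```

Keying the result by a frozenset of genes keeps the three-gene group separate from the two-gene groups, mirroring how the shared pathways are reported in the text.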
The three common pathways related to the ESR1, FOSB and SPP1 genes are HALLMARK MYC TARGETS V1, HALLMARK G2/M CHECKPOINT and HALLMARK E2F TARGETS. High expression of the ESR1 and FOSB genes activates these three pathways, while high expression of the SPP1 gene inhibits them (Fig.
8a1, b1, c1). In other words, ESR1 and FOSB are low-risk factors whose high expression activates these three pathways, whereas SPP1 is a high-risk factor whose high expression inhibits them (Fig.
8 a, b, c). Hence, activation of these three pathways is conducive to improving the survival time of liver cancer patients. The MYC TARGETS V1 pathway is a new anticancer target [
30‐
32] which is closely related to cell proliferation, differentiation and the cell cycle. Likewise, the G2/M CHECKPOINT pathway [
33] and the HALLMARK E2F TARGETS pathway are both closely related to the cell cycle [
34]. In summary, patients with liver cancer whose cell cycle pathways are activated have a better prognosis.
The seven common pathways related to the ESR1 and SPP1 genes are HALLMARK PANCREAS BETA CELLS, HALLMARK ESTROGEN RESPONSE LATE, HALLMARK ADIPOGENESIS, HALLMARK FATTY ACID METABOLISM, HALLMARK BILE ACID METABOLISM, HALLMARK XENOBIOTIC METABOLISM and HALLMARK PEROXISOME. High ESR1 expression activates the HALLMARK PANCREAS BETA CELLS and HALLMARK ESTROGEN RESPONSE LATE pathways and inhibits the other five pathways (HALLMARK ADIPOGENESIS, HALLMARK FATTY ACID METABOLISM, HALLMARK BILE ACID METABOLISM, HALLMARK XENOBIOTIC METABOLISM and HALLMARK PEROXISOME). The SPP1 gene had the opposite effect (Fig.
8 a2, c2). Because ESR1 is a low-risk factor and SPP1 is a high-risk factor, liver cancer patients in whom the HALLMARK PANCREAS BETA CELLS and HALLMARK ESTROGEN RESPONSE LATE pathways are activated and the HALLMARK ADIPOGENESIS, HALLMARK FATTY ACID METABOLISM, HALLMARK BILE ACID METABOLISM, HALLMARK XENOBIOTIC METABOLISM and HALLMARK PEROXISOME pathways are inhibited have a better prognosis. Functionally, these seven pathways fall into four groups: 1. The prognosis of liver cancer patients in whom the HALLMARK PANCREAS BETA CELLS pathway is activated is better than that of patients in whom it is inhibited. Inhibition of the HALLMARK PANCREAS BETA CELLS pathway and islet cell dysfunction are important causes of type 2 diabetes. This also means that patients with liver cancer complicated by type 2 diabetes have a poor prognosis. Patients with type 2 diabetes are also a high-risk population for developing liver cancer. This conclusion is consistent with that of an epidemiological investigation of liver cancer [
17]. 2. The prognosis of liver cancer patients in whom the HALLMARK ESTROGEN RESPONSE LATE pathway is activated is better. Clinically, “palmar erythema” and “spider nevus” appear in some patients with cancer [
35] and severe liver dysfunction [
36]. These manifestations are caused by the decreased metabolism of estrogen in the liver, resulting in excessive estrogen [
37] in the blood, which stimulates congestion and dilation of the capillaries and small arteries. In other words, the presence of “palmar erythema” and “spider nevus” is a manifestation of inhibition of the estrogen pathway, and the prognosis of liver cancer patients with these signs is poor. In addition, in clinical practice, some male liver cancer patients show breast development because inhibited estrogen metabolism raises the estrogen level in their blood. The prognosis of such liver cancer patients is poor [
38]. 3. The prognosis is better in patients with liver cancer whose fat metabolism-related pathways (HALLMARK ADIPOGENESIS, HALLMARK FATTY ACID METABOLISM, HALLMARK BILE ACID METABOLISM and HALLMARK PEROXISOME) are inhibited. Epidemiological investigation shows that obesity is one of the important factors causing liver cancer, which is consistent with the better prognosis of patients whose fat metabolism-related pathways are inhibited. 4. Patients whose HALLMARK XENOBIOTIC METABOLISM pathway is inhibited have a better prognosis.
The four common pathways related to the ESR1 and FOSB genes are HALLMARK MYC TARGETS V2, which is activated, and HALLMARK HEME METABOLISM, HALLMARK COAGULATION and HALLMARK UV RESPONSE DN, which are inhibited. Both ESR1 and FOSB are low-risk factors; therefore, patients in whom the HALLMARK MYC TARGETS V2 pathway is activated and the HALLMARK HEME METABOLISM, HALLMARK COAGULATION and HALLMARK UV RESPONSE DN pathways are suppressed have a better prognosis. The HALLMARK MYC TARGETS V2 pathway is closely related to the cell cycle; that is, the prognosis of liver cancer patients with an activated cell cycle pathway is better, which is consistent with the conclusion reached above. The HALLMARK HEME METABOLISM pathway regulates heme metabolism, whose main product is bile pigment, a group of compounds that includes bilirubin, biliverdin, bilinogen and bilin. Under normal circumstances, bile pigment is mainly excreted with bile. Bilirubin, which is orange-yellow, is the main pigment in bile, and disorders of bilirubin metabolism are closely related to clinical hepatobiliary diseases. If the HALLMARK HEME METABOLISM pathway is activated, heme is metabolized into bilirubin in large amounts; the resulting excessive plasma concentration diffuses into tissue and causes jaundice (easily seen in the sclera, skin, etc.). According to the data analysis in this work, patients with an inhibited HALLMARK HEME METABOLISM pathway have a good prognosis, whereas those with an activated HALLMARK HEME METABOLISM pathway have a poor prognosis. Patients with an activated HALLMARK HEME METABOLISM pathway show jaundice-related symptoms, and liver cancer patients with jaundice have a poor prognosis.
Patients with a suppressed HALLMARK COAGULATION pathway also have a good prognosis. The HALLMARK COAGULATION pathway mainly regulates coagulation function. Abnormal coagulation function is a common clinical finding in liver cancer patients and is mainly related to a lack of coagulation factors, thrombocytopenia and increased vascular permeability. The data analysis in this work shows that the prognosis of patients with inhibited blood clotting function is better than that of patients in whom this function is activated.
This analysis found that the prognosis of liver cancer patients is mainly related to the following functions: 1. Prognosis is closely related to the regulation of the cell cycle, and patients with an activated cell cycle have a good prognosis. 2. Liver cancer patients with an activated HALLMARK PANCREAS BETA CELLS pathway have a good prognosis, while liver cancer patients with type 2 diabetes have a poor prognosis. 3. Patients with an activated hepatocellular estrogen pathway have a good prognosis, and those with “palmar erythema”, “spider nevus” and abnormal breast development have a poor prognosis. 4. Liver cancer patients whose fat metabolism-related pathways are inhibited have a good prognosis. 5. Liver cancer patients whose HALLMARK XENOBIOTIC METABOLISM pathway is inhibited have a good prognosis. 6. The prognosis of liver cancer patients is good if the HALLMARK HEME METABOLISM pathway is inhibited, and poor if the patient has jaundice. 7. Liver cancer patients whose HALLMARK COAGULATION pathway is inhibited have a good prognosis.