Background
Osteosarcoma (OS) is a common malignant bone cancer in adolescents worldwide and has a high tendency to metastasize [1]. Considerable progress in OS prevention and treatment has been made over the past decades through therapeutic strategies such as postoperative neo-adjuvant chemotherapy and multi-agent systemic chemotherapy [2, 3]. However, statistical evidence suggests that the incidence and mortality rates of OS have been growing by approximately 1.4% each year [4]. Recent studies have demonstrated that the 5-year survival rate of OS remains about 65% and that more than half of OS patients die from metastasis [5, 6]. Therefore, identifying novel prognostic gene markers for OS metastasis is imperative for improving the overall survival of OS patients.
The advent of next-generation sequencing technologies has enabled rapid disease detection and diagnosis in recent decades. Accordingly, extensive studies based on microarray and transcriptome sequencing data have been carried out to identify potential driver genes involved in the occurrence, metastasis and recurrence of tumors. Notably, existing evidence has shown that numerous gene signatures have significant prognostic value for OS. Wang et al. argued that OS patients with a high ALDH1B1 level had an unfavorable clinical outcome compared with those with a low ALDH1B1 level, implying that this gene might be a potential prognostic marker for patients with OS [7]. Shi et al. examined the expression difference and prognostic power of DDX10 based on a dataset from the Gene Expression Omnibus (GEO) database. They found that DDX10 was expressed at a higher level in OS tissues than in normal tissues and that an increased DDX10 level was related to a poor prognosis [8]. Furthermore, a previous study analyzed three miRNA expression profiles and constructed a support vector machine (SVM) classifier with 15 differentially expressed miRNAs (DEmiRNAs). The results showed that this classifier predicted OS recurrence with relatively high accuracy, suggesting that these DEmiRNAs were possibly associated with OS prognosis [9]. Liu et al. recently screened a four-pseudogene signature for OS survival prediction based on RNA sequencing data, and this pseudogene panel could clearly differentiate high- and low-risk patients with OS [10]. Although previous studies have identified many gene markers involved in OS development and recurrence, the influence of gene signatures on the survival prognosis of OS requires further investigation.
In the present study, the differentially expressed genes (DEGs) between metastatic and non-metastatic OS samples were identified from the training dataset obtained from The Cancer Genome Atlas (TCGA) database. Then, prognostic genes were screened, followed by optimized selection based on the SVM recursive feature elimination (SVM-RFE) algorithm. The optimal prognostic genes were used to construct an SVM classifier to separate metastatic and non-metastatic OS samples. Additionally, univariate and multivariate Cox regression analyses were used to extract independent OS prognostic genes to construct a risk score (RS) model. Independent clinical prognostic indicators were then identified, followed by construction of a predictive nomogram. Finally, functional analyses of prognosis-related genes were performed. Our findings should improve the understanding of clinical prognostic outcomes of OS patients.
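The feature selection step can be illustrated with a minimal sketch of SVM-RFE. It assumes a linear SVM and the usual ranking criterion, in which the feature with the smallest squared weight is dropped at each round. A plain subgradient-descent trainer stands in for whatever SVM implementation was used in the study, and the function names, hyperparameters and toy data are all hypothetical.

```python
def train_linear_svm(X, y, lam=0.01, lr=0.1, epochs=200):
    """Fit a linear SVM by full-batch subgradient descent on the hinge loss.

    X is a list of feature rows, y a list of labels in {-1, +1}.
    Returns (weights, bias). Hyperparameters are illustrative assumptions.
    """
    n, d = len(X), len(X[0])
    w, b = [0.0] * d, 0.0
    for _ in range(epochs):
        # Gradient of the L2 penalty; hinge-loss terms are added below.
        grad_w, grad_b = [lam * wj for wj in w], 0.0
        for xi, yi in zip(X, y):
            margin = yi * (sum(wj * xj for wj, xj in zip(w, xi)) + b)
            if margin < 1:  # sample violates the margin, so it contributes
                for j in range(d):
                    grad_w[j] -= yi * xi[j] / n
                grad_b -= yi / n
        w = [wj - lr * gj for wj, gj in zip(w, grad_w)]
        b -= lr * grad_b
    return w, b


def svm_rfe(X, y, n_keep):
    """Recursively drop the feature with the smallest squared SVM weight."""
    remaining = list(range(len(X[0])))
    eliminated = []
    while len(remaining) > n_keep:
        # Retrain on the surviving features only, then rank them.
        X_sub = [[row[j] for j in remaining] for row in X]
        w, _ = train_linear_svm(X_sub, y)
        worst = min(range(len(remaining)), key=lambda k: w[k] ** 2)
        eliminated.append(remaining.pop(worst))
    return remaining, eliminated


# Toy data (hypothetical): feature 0 separates the classes strongly,
# feature 1 weakly, and feature 2 carries no information.
X_toy = [[2.0, 0.5, 0.0], [2.0, 0.1, 0.0], [-2.0, -0.1, 0.0], [-2.0, -0.5, 0.0]]
y_toy = [1, 1, -1, -1]
kept, dropped = svm_rfe(X_toy, y_toy, n_keep=1)
```

On the toy data the uninformative feature is eliminated first and the strongly separating feature survives. With real expression data the features would be genes, and the number of genes to keep would typically be chosen by cross-validated accuracy.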
Discussion
Existing evidence has demonstrated that identifying key gene markers related to OS survival prognosis is crucial, as it provides important theoretical references for developing promising therapeutic strategies for OS treatment. Herein, we established an SVM-based classifier to distinguish metastatic from non-metastatic OS samples. Moreover, eight independent prognostic genes were identified to construct an RS model. Tumor metastasis and RS model status were found to serve as independent prognostic factors for OS survival. Additionally, functional analyses of prognosis-related genes revealed that they were significantly enriched in the GO-BP term of immune responses and in the cytokine-cytokine receptor interaction pathway.
Tumor metastasis is a leading cause of the high mortality of various tumors. In recent decades, high-throughput sequencing technologies have greatly facilitated the understanding of the function of metastasis-related genes by decoding the genomes of cancer patients [26]. An increasing number of researchers have also concentrated on exploring the underlying pathogenesis of OS metastasis [27, 28]. Moreover, building a prediction model of OS metastasis is increasingly important for prognostication and clinical decision-making. The SVM, which can effectively assign entities to different classes when analyzing microarray data, is frequently used to construct sample classification models owing to its high accuracy and flexibility in modeling multisource data [29]. He et al. established an SVM classifier using 64 feature genes for OS, and this classifier differentiated metastatic from non-metastatic OS samples in the dataset GSE21257 with a prediction accuracy of 100% [30]. Herein, we constructed an SVM classifier based on 45 prognosis-related genes to discriminate metastatic from non-metastatic OS samples. The performance evaluation revealed that this classifier performed well in the training set, with an AUC of 0.969, a sensitivity of 0.915 and a specificity of 0.884. These results were also verified in the validation set GSE21257, where the SVM classifier again performed well, with an AUC of 0.907, a sensitivity of 0.857 and a specificity of 0.778. These findings imply that the 45 prognostic genes might be key biomarkers for distinguishing metastatic from non-metastatic OS patients.
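For readers less familiar with these metrics, the following minimal sketch shows how sensitivity, specificity and AUC can be computed from true labels and classifier scores. The labels, scores, the 0.5 decision threshold and the function names are hypothetical and are not data or settings from this study.

```python
def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels, 1 = metastatic."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc(y_true, scores):
    """AUC as the probability that a positive sample outranks a negative one.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical decision scores for six samples (not data from the study).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.5, 0.3, 0.1]
preds = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold of 0.5
sens, spec = sensitivity_specificity(labels, preds)
area = auc(labels, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes ranking quality across all thresholds, which is why the three values are reported together.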
The correlations between these prognosis-related genes and clinical survival were investigated by a multivariate logistic regression analysis. The results showed that there were eight independent prognostic risk genes for OS: KCNJ15, SLC24A4, ASPA, REM1, SCARA5, LANCL3, CPA6 and TRH. An RS model was also constructed to divide OS patients into high- and low-risk groups. Consequently, this eight-gene signature performed well in differentiating metastatic from non-metastatic OS patients.
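A gene-signature risk score of this kind is commonly computed as a coefficient-weighted sum of expression levels, with patients split at the cohort median. The sketch below assumes that formulation; this section does not state the exact formula or cutoff used, and the coefficients, expression values and function names are hypothetical placeholders rather than the fitted values.

```python
from statistics import median

# Hypothetical regression coefficients for the eight genes. These are
# arbitrary placeholders, NOT the fitted values from the study.
COEFS = {
    "KCNJ15": 0.2, "SLC24A4": -0.1, "ASPA": 0.3, "REM1": -0.2,
    "SCARA5": 0.1, "LANCL3": 0.4, "CPA6": -0.3, "TRH": 0.2,
}


def risk_score(expression, coefs=COEFS):
    """Linear risk score: sum of coefficient * expression level per gene."""
    return sum(beta * expression[gene] for gene, beta in coefs.items())


def stratify(scores):
    """Label a patient 'high' if the score exceeds the cohort median."""
    cutoff = median(scores.values())  # median cutoff is an assumption
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Three hypothetical patients with made-up expression values.
patients = {
    "P1": dict.fromkeys(COEFS, 1.0),
    "P2": dict.fromkeys(COEFS, 2.0),
    "P3": dict.fromkeys(COEFS, 3.0),
}
scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = stratify(scores)
```

A median split keeps the two groups balanced in size; an outcome-optimized cutoff is an alternative but needs independent validation to avoid overfitting.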
KCNJ15 (also known as KIR4.2) is a member of the KIR4 subfamily and encodes a potassium channel. Liu et al. recently reported that the silencing of KCNJ15 played key roles in tumor malignancy and was related to an unfavourable prognosis in renal carcinoma [31]. However, whether KCNJ15 is directly involved in OS progression has not been clarified. Notably, multiple investigations have demonstrated that some potassium ion channels in OS cells, such as the hSlo potassium channel, are implicated in carcinogenesis [32, 33]. Therefore, the potential roles of KCNJ15 in the pathogenesis of OS need to be investigated in the future.
SLC24A4/NCKX4 is located in chromosome region 14q32 and is a member of the potassium-dependent sodium-calcium exchanger gene family [34]. No reports have addressed the relationship between this gene and OS development.
REM1 encodes a GTPase and participates in regulating the activity of voltage-dependent Ca2+ channels. Numerous studies have suggested that SCARA5, a member of the scavenger receptor family, is involved in the molecular mechanisms of various cancers such as hepatocellular carcinoma [35]. You et al. previously found that SCARA5 served as a key biomarker for the development and metastasis of breast cancer [36]. An earlier study reported that CPA6 was remarkably up-regulated in early-stage oral squamous cell carcinoma (OSCC) samples compared with late-stage samples, suggesting that this gene might have crucial diagnostic value for OSCC [37]. Another study emphasized that the methylation of TRH could distinguish OSCC and oropharyngeal SCC patients from healthy individuals with high accuracy [38]. Unfortunately, the association between these eight independent prognostic genes and OS has not yet been unraveled. Further investigations are essential to understand the underlying roles of these genes in the diagnosis and prediction of OS.
Additionally, the DEGs between the high- and low-risk groups in the training dataset were extracted to understand the influence of this eight-gene signature on OS prognosis prediction. There were 614 significant DEGs (117 up-regulated and 497 down-regulated genes). The GO-BP analysis indicated that these genes were mainly involved in immune responses. Mori et al. argued that up-regulation of the immune response was a critical characteristic in patients with tumors and that several immunotherapies were potent approaches for patients with OS metastasis [39]. Moreover, a wealth of evidence has suggested that immunotherapies can regulate the tumor microenvironment and re-activate prolonged immune responses [40, 41]. The KEGG enrichment analysis showed that these genes were predominantly associated with the cytokine-cytokine receptor interaction pathway. Similarly, Chen et al. performed a bioinformatics analysis based on a circRNA microarray dataset and three gene expression profiles of OS cell lines. They found that the down-regulated DEGs from the three gene profiles mainly played prominent roles in the cytokine-cytokine receptor interaction pathway [42]. These findings suggest that the initiation of immune responses and cytokine-cytokine receptor interaction possibly contribute to OS progression.
In the current study, Cox regression analysis also showed that tumor metastasis and RS model status acted as independent prognostic factors for OS survival. These two survival-related factors were then incorporated as variables into the nomogram, and the results indicated that RS model status had the greatest influence on OS survival prognosis. The nomogram is a powerful risk assessment tool for a wide variety of diseases including OS, and provides important guidance for clinical outcome prediction, therapy selection and follow-up care [43]. We noted that the predicted 3- and 5-year overall survival rates for OS patients were similar to the actual observations, implying that tumor metastasis and RS model status are vital clinical characteristics for survival prediction of OS patients.
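A nomogram is read by converting each predictor into points and summing them. The sketch below illustrates one common scaling convention, assuming that both predictors are binary and that their Cox coefficients are available. The coefficient values and the names `nomogram_points` and `total_points` are hypothetical and do not come from this study. Mapping total points to 3- and 5-year survival probabilities would additionally require the baseline survival function, which is not shown.

```python
def nomogram_points(coefs):
    """Scale each binary predictor's coefficient to a 0-100 point axis.

    The predictor with the largest absolute coefficient is worth 100
    points when present; the others are scaled proportionally. Using the
    absolute value is a simplification that ignores effect direction.
    """
    largest = max(abs(b) for b in coefs.values())
    return {name: 100.0 * abs(b) / largest for name, b in coefs.items()}


def total_points(points, patient):
    """Sum the points of every predictor that is present (value 1)."""
    return sum(points[name] for name, present in patient.items() if present)


# Hypothetical log hazard ratios, not the fitted values from the study.
coefs = {"metastasis": 0.8, "high_RS": 1.6}
points = nomogram_points(coefs)
score = total_points(points, {"metastasis": 1, "high_RS": 1})
```

Under this convention, the predictor with the greatest influence on prognosis is the one that spans the full 100-point axis.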
Although an eight-gene panel and two independent prognostic factors have been identified as being associated with OS prognosis, the detailed pathologic mechanisms have not been elucidated. For example, whether these gene signatures are involved in molecular pathways such as the cytokine-cytokine receptor interaction pathway remains to be clarified. Moreover, a more accurate classification based on a larger sample size with clinical information is necessary to distinguish metastatic from non-metastatic OS patients. In addition, external validation was not carried out to check the reliability of our nomogram, and the performance of the nomogram established here also remains to be evaluated. Finally, corresponding experimental research is needed to verify the biological functions of the key genes.
In conclusion, we constructed an SVM-based classifier to separate metastatic and non-metastatic OS patients. Moreover, the eight-gene signature and two independent prognostic factors (tumor metastasis and RS model status) were closely related to OS survival. These findings improve the understanding of OS metastasis and prognosis. However, relevant validation studies and optimization of the prognostic model for OS will be considered in the future.