Introduction
Thymoma arises from thymic epithelial cells and accounts for approximately 20% of anterior mediastinal tumors, making it the most common tumor of the anterior mediastinum. The annual incidence of thymoma is approximately 0.13 per 100,000, with the age at onset typically ranging from 30 to 50 years (Strobel et al. 2010; Girard et al. 2009). The incidence does not differ significantly between men and women (Panarese et al. 2014; Shelly et al. 2011). The pathogenesis of thymoma remains unclear, although a few reported cases have suggested that the disease might be related to infection with Epstein–Barr virus or type I T-cell virus (Okumura et al. 2002; Lee 2009; Ciardiello and Tortora 2008; Cappuzzo et al. 2005; Erkmen et al. 2011).
The prognosis of patients with thymoma is largely determined by histological type, which is complex, and this complexity has so far prevented the adoption of uniform measurement standards. The World Health Organization (WHO) classification of thymoma is largely based on the proportion of lymphocytes in the tumor. WHO type A (medullary type) and type B1 are considered to be less invasive, while type B3 is considered to be more invasive (Cappuzzo et al. 2005; Henley et al. 2004; Gumustas et al. 2013). Masaoka staging is based on whether the tumor capsule is infiltrated and whether there is metastasis to other parts of the body. Studies have shown a good correlation between Masaoka staging and the WHO pathological classification, and both systems have been widely used in clinical practice as independent predictors of patient prognosis and survival. Type A (medullary type) and type AB (mixed type) thymomas in the WHO classification correspond to Masaoka stages I and II because they show less local infiltration. Type B (cortical type) thymoma is associated with frequent invasion and metastasis and corresponds to Masaoka stages III and IV (Henley et al. 2002; Luo et al. 2016).
Surgical resection remains the primary treatment for thymoma, with radiation and chemotherapy as effective supplemental treatments (Attaran et al. 2012). However, effective guidelines are still lacking for the treatment of thymic tumors of different histological types, partly because of their histological diversity and partly because of their low incidence.
With recent developments in molecular biology, an increasing number of studies have sought to predict the prognosis of thymoma accurately and to develop more effective treatments. Studies have shown that increased expression of tumor-associated genes, such as FPGS (folylpolyglutamate synthase)/GGH (gamma-glutamyl hydrolase) and VEGF (vascular endothelial growth factor), is related to the degree of malignancy in thymic carcinoma and B3 thymoma (refs. 11, 15). Studies have also shown that c-kit expression is significantly lower in thymic adenoma than in thymic carcinoma; c-kit expression was detectable in approximately 70–86% of patients with thymic carcinoma, compared to only 0–5% of those with thymic adenoma. Thus, c-kit could be used as a specific biomarker for thymic carcinoma, as well as a potential target for tyrosine kinase inhibitor (TKI) treatment (Henley et al. 2004; Badve et al. 2012). Several studies have also investigated the epigenetics of thymoma, including histone modification, chromatin recombination, and gene methylation (Gumustas et al. 2013; Luo et al. 2016; Badve et al. 2012).
Because standardized risk evaluation criteria for predicting the prognosis of patients with thymoma are lacking, we analyzed mRNA-seq datasets from The Cancer Genome Atlas (TCGA). We constructed a prognostic gene signature for patients with thymoma by univariate significance analysis of gene expression followed by Cox regression survival analysis. The final regression model is a potentially useful tool but needs to be verified in clinical practice. Nevertheless, our findings may help to elucidate the pathogenesis of thymoma in the future.
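The univariate Cox screening step described above can be illustrated with a minimal sketch. It fits a one-covariate Cox model for a single gene by Newton-Raphson on the partial log-likelihood, using the Breslow convention for tied event times. The function name `fit_univariate_cox`, the toy data, and the choice of pure Python are assumptions made for illustration only; this section does not state which software or settings were actually used.

```python
import math

def fit_univariate_cox(times, events, expression, max_iter=50, tol=1e-8):
    """Estimate the Cox coefficient (log hazard ratio) for one gene.

    times      -- follow-up time for each patient
    events     -- 1 if death was observed, 0 if the patient was censored
    expression -- expression value of the gene for each patient
    Hypothetical helper: Newton-Raphson on the Cox partial log-likelihood.
    """
    beta = 0.0
    for _ in range(max_iter):
        gradient = 0.0
        hessian = 0.0
        for t_i, d_i, x_i in zip(times, events, expression):
            if not d_i:
                continue  # censored patients contribute only through risk sets
            s0 = s1 = s2 = 0.0
            for t_j, x_j in zip(times, expression):
                if t_j >= t_i:  # patient j is still at risk at time t_i
                    w = math.exp(beta * x_j)
                    s0 += w
                    s1 += w * x_j
                    s2 += w * x_j * x_j
            mean = s1 / s0
            gradient += x_i - mean             # observed minus expected covariate
            hessian -= s2 / s0 - mean * mean   # minus the risk-set variance
        if hessian == 0.0:
            break  # no information, e.g. constant expression
        step = gradient / hessian
        beta -= step
        if abs(step) < tol:
            break
    return beta

# Toy data invented for illustration (not TCGA values).
times = [5, 8, 12, 15, 20, 25]
events = [1, 1, 0, 1, 1, 0]
expression = [2.0, 1.0, 1.5, 0.5, 1.2, 0.3]

beta = fit_univariate_cox(times, events, expression)
hazard_ratio = math.exp(beta)  # > 1 suggests a risk factor, < 1 a protective factor
print(f"beta = {beta:.3f}, hazard ratio = {hazard_ratio:.3f}")
```

In a screening setting, a fit of this kind would be repeated for each candidate gene, and genes whose coefficients are statistically significant would be carried forward to the multivariate model. The significance test itself is omitted from this sketch.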
Discussion
Thymoma is a tumor formed by the abnormal differentiation of thymic epithelial cells and occurs most often in the anterior mediastinum. Its incidence is approximately 0.13 per 100,000; it mostly affects patients aged 30–50 years and shows no difference between the sexes (Strobel et al. 2010; Girard et al. 2009; Panarese et al. 2014). The pathogenesis of thymoma remains unclear, although some studies have suggested that it might be associated with certain viral infections. However, the low incidence of thymoma has limited the development of large-scale clinical trials and in-depth basic research (Okumura et al. 2002; Lee 2009; Kelly 2013; Riely and Huang 2010).
The histological features of thymic tumors are variable. In the WHO classification, the proportion of lymphocytes in thymic tissue generally increases gradually from type A to type B3. This stepwise change in lymphocyte composition is associated with an increasing degree of tumor malignancy (Henley et al. 2002; Marx et al. 2014). Studies have also shown a good correlation between Masaoka staging and the WHO classification: types A and AB correspond to Masaoka stages I and II, while type B corresponds to Masaoka stages III and IV. The 10-year survival rate of patients with type A thymoma can reach 100%. Type C, also known as thymic carcinoma, has a median survival of 24–49 months and a 5-year survival rate of 30–50% (Henley et al. 2002; Sasaki et al. 2002). However, there are still no unified evaluation standards for thymic tumors with different histological features.
With recent developments in molecular biology, an increasing number of techniques are being applied to investigate the tumor genome, such as transcriptome analysis, gene mutation detection, and epigenetic analysis (Yao et al. 2017; Mao et al. 1998). Based on TCGA database mining, our present study integrated and analyzed thymoma datasets to identify differentially expressed genes (DEGs) in tumor tissues. We then identified a seven-gene signature and derived a risk score by univariate and multivariate Cox regression analysis; the risk score was significantly correlated with overall survival (OS). Consequently, this system could effectively predict the OS of patients with thymoma. Furthermore, our seven-gene signature may be useful in elucidating the pathogenesis of thymic tumors. To our knowledge, our results represent the first description of a gene signature for thymic tumors.
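A risk score of this kind is conventionally computed as a weighted sum of the expression values of the signature genes, with the multivariate Cox coefficients as weights. The sketch below assumes that form. The gene names are the seven discussed in this section, and the coefficient signs follow the protective/risk roles reported here, but the magnitudes are invented. The median cutoff for splitting patients into high- and low-risk groups is also an assumption, as are the names `COEFFICIENTS`, `risk_score`, and `stratify`.

```python
from statistics import median

# Hypothetical coefficients: negative = protective, positive = risk.
# Signs follow the roles described in the text; magnitudes are invented.
COEFFICIENTS = {
    "LIPE": -0.40,
    "FBLN2": -0.25,
    "KLF4": -0.30,
    "ABCA10": 0.35,
    "DGAT2": 0.20,
    "SLC16A7": 0.45,
    "SELENBP1": 0.15,
}

def risk_score(expression):
    """Weighted sum of signature-gene expression for one patient."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())

def stratify(patients):
    """Score every patient and split at the median score (assumed cutoff)."""
    scores = {pid: risk_score(expr) for pid, expr in patients.items()}
    cutoff = median(scores.values())
    groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
    return scores, cutoff, groups

# Toy expression profiles invented for illustration (not TCGA values).
example_patients = {
    "patient_1": {"LIPE": 2.1, "FBLN2": 1.8, "KLF4": 2.5, "ABCA10": 0.4,
                  "DGAT2": 0.9, "SLC16A7": 0.3, "SELENBP1": 1.0},
    "patient_2": {"LIPE": 0.5, "FBLN2": 0.7, "KLF4": 0.6, "ABCA10": 2.2,
                  "DGAT2": 1.9, "SLC16A7": 2.4, "SELENBP1": 1.7},
    "patient_3": {"LIPE": 1.2, "FBLN2": 1.1, "KLF4": 1.4, "ABCA10": 1.0,
                  "DGAT2": 1.2, "SLC16A7": 1.1, "SELENBP1": 1.3},
}

example_scores, example_cutoff, example_groups = stratify(example_patients)
for pid, score in example_scores.items():
    print(pid, round(score, 3), example_groups[pid])
```

The association between the resulting groups and OS would then be assessed with a survival comparison, such as Kaplan-Meier curves with a log-rank test, which this sketch does not cover.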
All seven genes in the signature were significantly associated with the prognosis of thymoma. Of these, LIPE, FBLN2, and KLF4 were found to be protective factors. LIPE hydrolyzes stored triglycerides to produce free fatty acids and can also convert cholesterol esters into free cholesterol for steroid hormone production. LIPE mRNA levels have been shown to be significantly increased in the adipose tissue of cancer patients (Gumustas et al. 2013; Thompson et al. 1993). FBLN2 is located on chromosome 3p25.1; the binding of its protein product to fibronectin and other ligands depends on calcium. This protein can also act as an adapter to coordinate interactions between FBN1 and ELN. FBLN2 is associated with the occurrence and development of tumors through its interaction with extracellular matrix (ECM) proteins (Law et al. 2012; Falkson et al. 2009). KLF4 is a transcription factor that can act as either an activator or a repressor; it regulates the expression of key transcription factors during embryonic development and plays an important role in maintaining embryonic stem cells and preventing their differentiation (Kelly 2013; Zhang et al. 2010).
We also investigated the molecular functions of the remaining risk factors. The ABCA10 gene comprises 40 exons and is commonly expressed in the heart, brain, and gastrointestinal tract. Its expression is inhibited when cholesterol is introduced into macrophages, indicating that it is a cholesterol-responsive gene involved in maintaining macrophage lipids in a steady state (Wenzel et al. 2003). The DGAT2 gene is located on chromosome 11q13 and, because it encodes a key enzyme in fat metabolism, is a candidate obesity gene. The DGAT2 protein is a component of glycolipid metabolism involved in the triglyceride biosynthesis pathway (Friedel et al. 2007). SLC16A7 (also known as monocarboxylate transporter protein 2, MCT2) is mainly located in the peroxisomes of prostate cancer cells and interacts with Pex19 via the peroxisomal transport mechanism. The overexpression of SLC16A7 in malignant cells, compared to non-malignant cells, is directly related to its peroxisomal localization. SLC16A7 is significantly overexpressed in malignant prostate tumors, suggesting that it may represent a biomarker of prostate cancer (Valenca et al. 2015). SELENBP1 was previously identified by membrane proteomics analysis as the most significantly downregulated protein in ovarian cancer cells. Selenium can interfere with the androgen pathway, which is regulated by the expression of SELENBP1. In malignant ovarian cancer, changes in SELENBP1 expression are useful indicators of abnormal selenium/androgen pathways, and such changes may reveal prognostic information relating to ovarian cancer (Huang et al. 2006).
Our present study has several limitations that need to be taken into consideration. First, we identified our seven-gene signature solely by bioinformatic analysis, and further experiments are needed to verify these results. Second, databases other than TCGA need to be mined to provide further validation. Third, the seven-gene signature model identified in this study needs to be closely integrated with clinical practice. To address these limitations, we plan to collect thymic tumor samples and clinical prognosis information and then verify our results experimentally. More studies on the functional effects of these seven genes and their relevance to patient survival are also needed, because functional knowledge of the genes in the signature remains limited.