Background
Head and neck squamous cell carcinoma (HNSCC) is a solid malignancy that is the sixth most common human cancer, with an annual incidence of more than 600,000 [
1]. A combination of chemotherapy, radiotherapy, and adequate surgical resection has transformed HNSCC from a universally deadly disease to a potentially curable one; nevertheless, the 5-year survival rate remains below 50% [
2]. Traditional stratification schemes based on multiple clinicopathological parameters such as the American Joint Committee on Cancer (AJCC) TNM staging system have been recognized as the primary criteria providing prognostic guidance for the management of patients with HNSCC [
3,
4]. Despite its ease of implementation and wide application, TNM staging is insufficient for estimating prognosis in subsets of HNSCC patients, and survival times vary considerably among individuals within the same stage [
5,
6]. Risk scores (RS) that capture such individual variation might guide better therapeutic strategies. An increasing body of evidence suggests that molecular risk assignments could be used to improve prognostic assessment and the identification of potential high-risk HNSCC patients [
6‐
9].
Transcription factors (TFs) are proteins that bind to specific DNA sequences and control the rate at which genetic information is transcribed from DNA to mRNA [
7]. Their role is to turn genes on and off, ensuring that they are expressed in the required cells, at the appropriate time, and in the required quantities. Increasing evidence suggests that deregulation of TFs characterizes the majority of human cancers, and some TFs have been associated with cancer diagnosis and prognosis [
8,
9]. For example, p53 is a tumor suppressor protein, and mutations of its gene (TP53) can be detected in more than half of all human cancers [
10]; c-Myc, in contrast, is an oncogenic TF that is overexpressed in some malignant cancer cells and has been associated with tumor progression and poor clinical outcome [
11]. Because of the significance of TFs in many biological processes and their aberrant activity in human cancer, we hypothesized that expression patterns of TFs may serve as prognostic biomarkers of cancer.
Cancer sample datasets accessible via The Cancer Genome Atlas (TCGA) and similar resources are an abundant data source for identifying biomarker signatures and predicting disease outcomes [
12,
13]. In our study, we extensively evaluated RNA-seq data from a cohort of 502 HNSCC patients using the available TCGA datasets. Using univariate survival analysis (USA) and a multivariate Cox stepwise regression (MCSR) algorithm, we identified six prognosis-related TFs. Based on their expression in the TCGA series, a prognostic model was built and then validated in two independent series (GSE41613 and GSE65858). Further multivariate Cox regression analysis (MCRA) and stratified analysis were used to confirm whether the multi-TF signature was an independent prognostic indicator in HNSCC. Our investigation offers new insights into the prediction of overall survival (OS) in patients with HNSCC.
Discussion
Currently, TNM staging is one of the main parameters that help clinicians predict patient outcomes and plan treatments; nevertheless, variation in outcomes suggests that clinical features cannot fully account for the phenotypes of different potential subtypes [
3,
4,
16]. Oncogenesis proceeds through several stages that require modifications of gene expression programs [
17]. TFs play important roles in controlling these programs, so their dysregulation contributes to the acquisition of tumor-associated properties [
18]. Previous studies [
19,
20] reported that the expression patterns of TFs may be an effective means of grading tumor subtypes. However, to date, expression profiles based on TFs in HNSCC have not been clarified.
Our study aimed to identify a TF expression signature that could predict outcomes for HNSCC patients at the individual level. To this end, we evaluated the prognostic significance of all differentially expressed TFs in HNSCC, selecting candidates by univariate survival analysis (USA) of the RNA-seq data retrieved from TCGA. However, the need to measure a large number of genes reduces the efficiency of prognostic biomarkers in clinical applications [
21]. Therefore, using an MCSR algorithm, we identified a multi-TF signature. This signature was more effective than individual TFs because predictive potential was maximized while the number of predictors was reduced [
14,
15,
21,
22]. Based on the MCSR results, we constructed a model consisting of six TFs to forecast the survival time of HNSCC patients.
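As a minimal illustration of how a signature of this kind is commonly turned into a per-patient risk score, the Python sketch below computes a linear combination of TF expression values weighted by Cox regression coefficients and splits patients at the cohort median. The coefficient values, the median cutoff, the toy expression values, and the function names are illustrative assumptions; they are not the values or the exact procedure fitted in this study.

```python
from statistics import median

# Hypothetical Cox coefficients (log hazard ratios) for the six TFs.
# These numbers are invented for illustration and are NOT the fitted values.
COEFFICIENTS = {
    "HOXA1": 0.20,
    "HOXB8": 0.15,
    "LHX1": 0.10,
    "MEIS1": -0.10,
    "ZBTB32": -0.25,
    "ZNF662": -0.30,
}


def risk_score(expression, coefficients=COEFFICIENTS):
    """Return the linear predictor: sum of coefficient * expression per gene."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())


def assign_risk_groups(scores):
    """Label a patient 'high' if the score exceeds the cohort median, else 'low'.

    The median cutoff is an assumption made for this sketch.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Toy expression values (arbitrary units) for three hypothetical patients.
patients = {
    "P1": {"HOXA1": 2.0, "HOXB8": 1.0, "LHX1": 1.0,
           "MEIS1": 0.0, "ZBTB32": 0.0, "ZNF662": 0.0},
    "P2": {"HOXA1": 0.0, "HOXB8": 0.0, "LHX1": 0.0,
           "MEIS1": 1.0, "ZBTB32": 1.0, "ZNF662": 2.0},
    "P3": {"HOXA1": 1.0, "HOXB8": 1.0, "LHX1": 1.0,
           "MEIS1": 1.0, "ZBTB32": 1.0, "ZNF662": 1.0},
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = assign_risk_groups(scores)
```

Because the score is a fixed linear formula, the same coefficients can be applied unchanged to an independent cohort, which is what makes external validation of such a model possible.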
Among these TFs, HOXA1 was previously reported as an oncogene in HNSCC. Upregulation of HOXA1 promoted the migration and invasion of HNSCC cells via the EMT pathway. More importantly, high levels of HOXA1 were found to be linked with poor prognosis in HNSCC [
23]. This finding is consistent with our results. Another candidate, HOXB8, is, like HOXA1, a member of the HOX family and has been found to be significantly linked with tumor metastasis and shorter overall survival in many human cancers [
24‐
26]. Further investigation revealed that HOXB8 was a predictor of the effects of FOLFOX4 chemotherapy in metastatic colorectal cancer [
27]. We therefore hypothesize that HOXB8 may act as an oncogene in HNSCC progression; this hypothesis requires further investigation. Aberrant expression of ZNF662 caused by epigenetic changes via DNA hypermethylation has been reported as a valuable biomarker of tumorigenesis and advanced HNSCC [
28]. In our study, ZNF662 was expressed at low levels in HNSCC and was associated with shortened survival. Down-regulation of MEIS1 has been reported to modulate the leukemic cell response to chemotherapy-induced apoptosis [
29]. This suggests that MEIS1 may participate in the regulation of chemoresistance in HNSCC and may be a potential target for anti-HNSCC drugs in the future. Additionally, LHX1 was reported as a driver gene in clear cell renal cell carcinoma that affects proliferation and apoptosis and promotes tumor growth [
30]. In the present study, upregulation of LHX1 was an indicator of poor prognosis in HNSCC. A recent study showed that ZBTB32 facilitated transcriptional repressor Zpo2 targeting to the GATA3 promoter to downregulate GATA3 expression and activity. Modulation of GATA3 by ZBTB32 in turn caused the development of aggressive breast cancers [
31]. In our study, loss of ZBTB32 was associated with shortened survival time in HNSCC.
Taken together, the Kaplan–Meier and ROC analyses demonstrated that expression of these TFs was a powerful predictor of prognosis in HNSCC, suggesting their potential research value in this context.
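For readers less familiar with the Kaplan–Meier method referred to above, the following Python sketch shows the product-limit estimator on toy data. The follow-up times, event indicators, and the function name are hypothetical and are not taken from the TCGA or GEO cohorts; in practice a dedicated survival analysis package would normally be used.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time for each patient
    events: 1 if death was observed at that time, 0 if the patient was censored
    Returns a list of (time, survival_probability) at each distinct death time.
    """
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(set(times)):
        # Patients leaving the risk set at time t (deaths plus censored).
        leaving = sum(1 for ti in times if ti == t)
        deaths = sum(1 for ti, ei in zip(times, events) if ti == t and ei == 1)
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Toy cohort: five hypothetical patients (times in months).
toy_times = [5, 8, 8, 12, 20]
toy_events = [1, 1, 0, 0, 1]
toy_curve = kaplan_meier(toy_times, toy_events)
```

Computing one such curve for the high-risk group and one for the low-risk group, and comparing them, is the usual way a risk signature is assessed graphically.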
Previous simulations have shown that prognostic models significantly linked with survival times in the training dataset can also be developed when using an entirely independent dataset [
32]. In this study, the usefulness of the multi-TF signature was validated in the non-overlapping GSE41613 and GSE65858 cohorts, indicating good reproducibility of this signature in HNSCC.
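One simple way to check how well a fixed risk score discriminates outcomes in an independent cohort is the area under the ROC curve (AUC). The Python sketch below computes the AUC through its rank interpretation, that is, the probability that a randomly chosen patient with the event has a higher score than a randomly chosen patient without it. It assumes a binary outcome at a fixed time horizon and ignores censoring, which is a simplification relative to time-dependent ROC methods; the scores, labels, and function name are hypothetical.

```python
def roc_auc(scores, labels):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    scores: risk score per patient (higher means higher predicted risk)
    labels: 1 if the event occurred by the chosen time horizon, else 0
    Ties between a positive and a negative score count as half a correct pair.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both outcome classes are required")
    correct = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                correct += 1.0
            elif p == n:
                correct += 0.5
    return correct / (len(positives) * len(negatives))


# Toy validation cohort: five hypothetical patients.
val_scores = [0.9, 0.4, 0.35, 0.8, 0.1]
val_labels = [1, 0, 1, 1, 0]
val_auc = roc_auc(val_scores, val_labels)
```

An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation, so values obtained in the validation series can be compared directly with those from the training series.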
Multivariate analysis showed that perineural invasion (PNI) and extranodal extension (ENE) were independent clinicopathological factors for predicting risk in HNSCC. Perineural growth is an unusual mode of tumor cell spread that does not follow the path of least resistance; it indicates a high risk of postoperative recurrence and is an important factor for poor prognosis in HNSCC [
33]. ENE is defined as tumor cells infiltrating extranodal tissues beyond the capsule of affected lymph nodes. It is a characteristic of more aggressive cancer and is associated with shortened survival [
34]. In stratified analysis, we found that the multi-TF signature remained a powerful predictor of prognosis within these subsets, suggesting that it is independent of these important clinicopathological parameters. This result implies that our multi-TF signature has the potential to enhance clinical prognostic tests, which may assist in improving patient stratification and treatment planning in future trials.
As with all research, our study has limitations. First, owing to limited data, we could obtain expression profiles for only 1639 of the thousands of known and predicted TFs. In addition, some clinical information was incomplete, which made our study susceptible to inherent biases. Finally, although gene set enrichment analysis (GSEA) was used to investigate the biological processes associated with the identified TFs, further studies are required to investigate their specific roles in cancer.