Introduction
Extranodal NK/T-cell lymphoma (ENKTL) is a subtype of non-Hodgkin lymphoma characterized by progressive lesions in the nasal cavity, the midface, the upper aerodigestive tract and other non-nasal sites. The disease occurs most frequently in Asian and Latin American populations [1]. Epstein-Barr virus (EBV) infection may be closely related to its pathogenesis [
2]. At an early stage of ENKTL, the combination of chemotherapy and radiotherapy prolongs patients’ survival and improves the quality of life [
3,
4]. However, for advanced refractory ENKTL patients, the efficacy of current treatment is not satisfactory [
5]. Immunotherapy provides a new direction for these patients [
6,
7]. Immunotherapy targeting programmed cell death protein 1 (PD-1) and programmed death-ligand 1 (PD-L1) has markedly improved therapeutic outcomes in ENKTL [
8,
9]. Searching for tumor-specific genes is beneficial for ENKTL diagnosis, the discovery of tumor-specific neoantigens and the development of novel therapeutic strategies. These tumor-specific genes can be used as predictors of the prognosis. Nevertheless, the genetic landscape and the mutation signature of ENKTL remain to be elucidated.
By characterizing tumor-associated unique genes, we could enrich therapeutic methods and improve prognosis. Good illustrations are epidermal growth factor receptor (EGFR)/anaplastic lymphoma kinase (ALK) in lung cancer [
10], CD19 in diffuse large B cell lymphoma [
11] and HER2 in breast cancer [
12]. Recently, gene detection has been used to predict prognosis and treatment sensitivity in cancer patients. In ENKTL, gene expression profiling (GEP) identified unique signatures that derive mainly from neoplastic NK cells. Cytotoxic-molecule (granzyme H) levels and the activity of ENKTL signaling pathways (NF-κB and JAK/STAT3) are both elevated [
8,
13]. Some gamma delta-peripheral T cell lymphomas (γδ-PTCLs) have STAT3 mutations [
14]. In addition to the above features, a genetic investigation found 6q21 deletion and identified PRDM1 as a candidate gene in NK cell-related malignancies. PRDM1 is located in the minimal common region (MCR). Methylation of PRDM1 inhibits its expression [15]. When treated with decitabine, NK cells experienced toxicity through enhanced PRDM1 levels [16]. Therefore, PRDM1 methylation may exist in ENKTL. HACE1 is another gene located within the 6q21 region. Loss of HACE1 function results from deletion and hypermethylation of its cytosine-phosphate-guanine (CpG) island. Abnormal HACE1 within 6q21 is a cause of NK cell lymphomagenesis [
17].
Machine learning (ML) algorithms are now involved in numerous aspects of medical research, integrating AI tools into clinical practice. In medicine, ML is a tool for analyzing large-scale data [18, 19]. It helps us understand cancer comprehensively from a molecular perspective, especially in cancer diagnosis [20‐22]. ML is therefore useful for identifying biomarkers across multiple datasets. Support-vector machines (SVMs) are important ML models with algorithms for classification and regression analysis. They can select the biomarkers that are most effective for classification [
23,
24].
Our study aims to identify gene signatures in patients with ENKTL. We first sequenced a pair of twins, one of whom has ENKTL, and analyzed the unique differential genes. Based on these genes, we analyzed ENKTL patients’ information in several databases to predict specific antigen mutations and new targets. We hope to understand the genetic background of ENKTL and to identify targets that predict prognosis. A clearer understanding of this genetic background would be of considerable benefit.
Discussion
ENKTL can be easily diagnosed by morphology, immunohistochemical markers and in situ hybridization. Currently, there is no standard guideline for the prevention and treatment of ENKTL, and no large-sample retrospective study. Previous retrospective studies indicated that the therapeutic effect in advanced and recurrent ENKTL is unsatisfactory. The prognostic factors reported across studies are inconsistent [
33‐
35]. Also, no prognostic molecular marker is applied in clinical practice. Therefore, it is imperative to seek ENKTL biomarkers for treatment and prognosis. We hope that these biomarkers could accurately evaluate the prognosis of patients, promote targeted therapy in ENKTL and support individualized treatment plans.
Several methods can be used to build linear regression models, and each suits datasets with particular features. The relationship between the number of samples (n) and the number of predictive variables (p) affects the bias of these models. Our study consists of 38 samples. Elastic net and SVM were used to screen specific target genes from the unique variants to distinguish tumors from normal samples [36]. The elastic net suits our data, in which the number of samples is much smaller than the number of predictors (n ≪ p). We identified 11 signature genes for prediction: CDC27, MOV10L1, CROCC, RP1L1, ZNF141, FCGR2C, NES, CCDC9, TPSD1, CACNA1I and BMP8A.
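As a rough illustration of why elastic net regularization suits data with fewer samples than predictors, the following minimal sketch fits an elastic net by cyclic coordinate descent in plain Python. It is not the pipeline used in this study: the objective parameterization (alpha, l1_ratio), the function names and the toy matrix are all assumptions made for the example.

```python
def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold (the L1 proximal step)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def elastic_net(X, y, alpha=0.1, l1_ratio=0.5, n_iter=100):
    """Minimize (1/2n)*||y - Xw||^2 + alpha*l1_ratio*||w||_1
    + 0.5*alpha*(1 - l1_ratio)*||w||^2 by cyclic coordinate descent."""
    n, p = len(X), len(X[0])
    w = [0.0] * p
    residual = list(y)  # equals y - Xw while w is all zeros
    for _ in range(n_iter):
        for j in range(p):
            col = [row[j] for row in X]
            z = sum(v * v for v in col) / n
            if z == 0.0:
                continue  # an all-zero feature carries no signal
            # Correlation of feature j with the residual that excludes
            # feature j's own current contribution.
            rho = sum(c * (r + c * w[j]) for c, r in zip(col, residual)) / n
            new_w = soft_threshold(rho, alpha * l1_ratio) / (
                z + alpha * (1.0 - l1_ratio)
            )
            delta = new_w - w[j]
            if delta != 0.0:
                residual = [r - c * delta for c, r in zip(col, residual)]
                w[j] = new_w
    return w


# Hypothetical toy data: 4 samples (rows) x 5 features (columns), so n < p.
X = [
    [1.0, 1.0, 1.0, 0.3, 0.0],
    [-1.0, 1.0, -1.0, 0.0, 0.0],
    [1.0, -1.0, -1.0, 0.0, 0.0],
    [-1.0, -1.0, 1.0, 0.0, 0.0],
]
y = [1.0, -1.0, 1.0, -1.0]  # +1 = tumor, -1 = normal (illustrative labels)

weights = elastic_net(X, y)
selected = [j for j, w_j in enumerate(weights) if abs(w_j) > 1e-12]
print(selected, weights)
```

In this toy case only the first feature keeps a nonzero weight. The L1 part of the penalty sets the weakly informative features to exactly zero, and the L2 part keeps the fit stable when predictors outnumber samples.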
Scientists have applied machine learning algorithms to predict diagnosis, prognosis and therapeutic efficacy in lymphoma [37‐39]. For example, Hyungsoon et al. developed an automated device for the molecular diagnosis of aggressive lymphomas and validated it on nodal lesions suspicious for lymphoma in 40 patients. The portable device can distinguish benign from malignant tumors [37]. Moreover, Shipp et al. applied supervised learning to distinguish cured disease from fatal/refractory disease. Specifically, the algorithm classified patients with different five-year survival rates and International Prognostic Index (IPI) scores into two groups for outcome prediction [38]. Besides, Julkunen et al. constructed a machine learning framework (comboFM) to predict the responses of drug combinations. They found synergistic action in the combination of an anaplastic lymphoma kinase inhibitor (crizotinib) and a proteasome inhibitor (bortezomib) in lymphoma [39]. The performance stability of these models could be further improved by careful selection of the study population, classification by pathological type and enlargement of the sample size.
Importantly, our data come from a pair of identical twins, one diagnosed with ENKTL and the other healthy. We collected a cancerous sample from the ENKTL patient and a non-cancerous sample from the healthy twin. Using the healthy twin as a control, we screened out mutant genes unique to the patient, some of which might be pathogenic. Because identical twins share a genetic background, this design may explain the alterations in ENKTL pathogenesis more convincingly. Next, we performed an elastic net analysis of ENKTL patients from multiple international platforms, combined with SVMs for improved accuracy. Compared with nonlinear mixed-effects models (NONMEM) and neural network models, SVMs better handle problems such as model selection, overfitting, nonlinearity, the curse of dimensionality and local minima. With limited sample information, SVMs can find the best compromise between model complexity and learning ability to obtain the best generalization. This makes our predictive models applicable to predicting ENKTL.
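The twin-as-control idea can be sketched as a set subtraction: variants called in the patient but absent from the healthy twin are kept and then grouped by gene. The snippet below is only an illustration under assumed inputs. The variant tuples, the annotation table and the placeholder gene names (GENE_A, GENE_B) are hypothetical, and a real analysis would also need variant calling and quality filtering.

```python
from collections import Counter


def unique_to_case(case_variants, control_variants):
    """Return the case variants absent from the control, keeping input order."""
    control = set(control_variants)
    return [v for v in case_variants if v not in control]


def count_by_gene(variants, gene_of):
    """Count variants per gene; variants without an annotation are skipped."""
    return Counter(gene_of[v] for v in variants if v in gene_of)


# Hypothetical variant calls as (chromosome, position, ref, alt) tuples.
patient_calls = [
    ("chr1", 1000, "A", "G"),
    ("chr1", 2000, "C", "T"),
    ("chr2", 3000, "G", "A"),
    ("chr2", 3500, "T", "C"),
]
healthy_twin_calls = [
    ("chr1", 1000, "A", "G"),
    ("chr2", 3500, "T", "C"),
]
# Hypothetical variant-to-gene annotation with placeholder gene names.
annotation = {
    ("chr1", 2000, "C", "T"): "GENE_A",
    ("chr2", 3000, "G", "A"): "GENE_A",
    ("chr2", 3500, "T", "C"): "GENE_B",
}

unique_calls = unique_to_case(patient_calls, healthy_twin_calls)
genes = count_by_gene(unique_calls, annotation)
print(unique_calls)
print(genes)
```

Variants shared by both twins, which most likely reflect their common germline background, drop out, so only patient-specific variants remain as candidates.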
Mechanistically, the tumorigenesis and invasion of ENKTL are complicated. We analyzed the molecular network using GO and KEGG enrichment analysis to elucidate the pathogenesis of ENKTL and find sites for targeted therapy. Functional enrichment of the unique variant genes reveals the biological processes in which these genes take part in ENKTL. Figure 3A indicated that extracellular exosomes were significantly correlated with ENKTL. A previous study reported similar results: upregulated exosomal miRNA was a biomarker identifying ENKTL patients with treatment failure [35]. Exosomal miRNAs might therefore be biomarkers of therapeutic efficacy. In addition, the Golgi membrane, clathrin-coated endocytic vesicle membrane, transport vesicle membrane and endoplasmic reticulum membrane all participated in the development of ENKTL, according to Fig. 3A. Latent membrane protein 1 (LMP1) is a stimulant of ENKTL progression, which upregulates eukaryotic translation initiation factor 4E (eIF4E) through the NF-κB pathway [40]. We hypothesized that these membrane-related mechanisms are involved in activating the tumorigenesis pathway and serve as an indicator of tumor progression. Other immunological signals (T cell receptor signaling pathway and phosphatidylcholine/phosphatidylserine-translocating ATPase activity) and complexes (MHC class II) are also involved in ENKTL. A study identified the expression of T-cell receptors in ENKTL and the rearrangement of T-cell-receptor genes [41]. The inhibition of ATPase activity and the regulation of MHC class II might be potential sites for targeted therapy.
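GO and KEGG enrichment of a gene list is commonly scored with a one-sided hypergeometric (over-representation) test, which asks how likely the observed overlap with a term would be if the genes had been drawn at random. Whether the tools used in this study apply exactly this test is an assumption, and the counts below are hypothetical rather than taken from our data.

```python
from math import comb


def enrichment_p_value(background_size, term_size, query_size, overlap):
    """One-sided hypergeometric p-value: P(X >= overlap) when query_size genes
    are drawn without replacement from background_size genes, term_size of
    which are annotated with the term."""
    total = comb(background_size, query_size)
    upper = min(term_size, query_size)
    tail = sum(
        comb(term_size, k) * comb(background_size - term_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical counts, not taken from the study.
p_small = enrichment_p_value(background_size=10, term_size=3, query_size=3, overlap=3)
p_large = enrichment_p_value(
    background_size=20000, term_size=200, query_size=100, overlap=5
)
print(p_small, p_large)
```

A small p-value means the term is over-represented in the gene list. In practice the p-values for all tested terms are then corrected for multiple testing before terms are reported as significant.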
Additionally, several deregulated cellular signaling networks have been extensively investigated in ENKTL. The Janus kinase/signal transducer and activator of transcription (JAK/STAT) pathway is the first representative. Compared with normal NK cells, proteins in the JAK/STAT pathway are differentially expressed in ENKTL cells [13, 42]. The platelet-derived growth factor receptor-α (PDGFR-α) pathway is another activated pathway in ENKTL and is correlated with cellular biological functions. Huang et al. used a tyrosine kinase inhibitor (imatinib mesylate) to inhibit the growth of the PDGFRα-overexpressing ENKTL cell line (MEC04) [13]. The NOTCH-1 signaling pathway involves Notch 1 and Notch 2, which synergistically regulate the differentiation and function of NKT cells [43]. Similarly, Huang et al. used two NOTCH-1 inhibitors to hinder NK cell growth [13]. Figure 5A indicated that these potential pathways are related to antigen processing and the Fc epsilon RI-mediated signaling pathway. Stimulatory antigens might be processed for presentation. Processed antigens could bind to the extracellular domain of the α chain of Fc epsilon RI to initiate intracellular signals. Furthermore, our results show the involvement of metabolic pathways, lysosomal pathways and Toll-like receptor pathways. The JAK/STAT, PDGFR-α and NOTCH-1 pathways participate in energy metabolism and lysosomal activities. Our findings are consistent with previous studies.
We depicted the landscape of ENKTL and identified a series of targetable genes. Among them, CDC27 (cell division cycle 27), ZNF141 (zinc finger protein 141), FCGR2C (Fc gamma receptor 2C) and NES (nestin) are four promising candidates. Upregulation of ZNF141, FCGR2C and NES and downregulation of CDC27 were both associated with robust dendritic cell (DC) and T cell infiltration. We deduce that ENKTL-associated proteins may be processed by DCs and presented to CD8+ T cells, which, given adequate infiltration of other T cell types, induces an immune attack. On the one hand, we analyzed these candidates functionally by GO enrichment analysis, KEGG enrichment analysis, GSEA and GSVA. On the other hand, their potential function in tumors has also been investigated in previous literature. First, CDC27 is a significant subunit responsible for promoting anaphase. High levels of CDC27 were observed in T-lymphoblastic lymphoma (T-LBL), where it facilitated proliferation, G1/S transition, protein upregulation (cyclin D1, CDK4 and PD-L1) and the inhibition of apoptosis [44]. Next, gene mapping relates ZNF141 to chromosomal aneusomy syndromes. Its defects cause developmental disorders involving some transcriptional regulators. Chromosomal aneusomy is one of the common genetic features of malignant tumor cells. Fetal death is a common outcome of chromosomal aneusomy [45]. Then, FCGR2C belongs to the low-affinity Fc gamma receptors for immunoglobulins. It is a transmembrane glycoprotein located on the surface of immune cells and participates in phagocytosis and clearance of immune complexes [46]. NES is an intermediate filament protein used as a marker of neural stem cells and progenitor cells in the central nervous system and as a marker of endothelial cells. In cancer, nestin is found in cancer stem-like cells and poorly differentiated cancer cells [47].
Although our study was the first large-scale data analysis focusing on gene signatures in patients with ENKTL, it has several limitations. We obtained a number of ENKTL-unique variant genes from the sequencing data of a pair of twins. Because of the limited number of samples, we selected training and validation sets of ENKTL from public databases to explore the predictive efficacy of these unique variant genes and to find the most important signature genes for ENKTL. First, we conducted whole-genome sequencing (WGS) rather than measuring the mRNA levels of these genes, so the transcriptional level of gene expression lacks validation in the twins. Second, analyzing large cohorts across multiple platforms introduces batch effects caused by differences in time, operators, reagents and instruments. Finally, the number of patients is limited: our patients are a single pair of twins. More data are needed to validate and confirm the identification results.
Conclusion
We used WGS to identify unique variant genes from the peripheral blood samples of an ENKTL patient and a healthy individual. By analyzing public databases, we identified CDC27, MOV10L1, CROCC, RP1L1, ZNF141, FCGR2C, NES, CCDC9, TPSD1, CACNA1I and BMP8A as unique genes of ENKTL. Their involvement in biological activity and immune infiltration was associated with ENKTL tumorigenesis and progression. ENKTL was associated with the antigen processing/presentation pathway, Fc epsilon RI signaling pathway, glyoxylate and dicarboxylate metabolism pathway, lysosome pathway and Toll-like receptor signaling pathway. Finally, our study concluded that ZNF141, FCGR2C, NES and CDC27 are promising ENKTL gene signatures. These four genes showed good predictive efficacy in the validation set, suggesting that they are convincing signature genes for ENKTL.