Background
Age at diagnosis may be particularly important in pediatric cancers because of the significant developmental changes that occur in humans from birth to age 18. Put another way, a few years in the life of a child represent a substantial percentage change in overall lifespan, which is not the case in later stages of adulthood.
Several studies have indicated that age of onset for pediatric neuroblastoma (NBL) and pediatric acute lymphoblastic leukemia (ALL) reflects disease course. Since the 1970s, a survival difference in pediatric NBL has been noted between older and younger diagnostic ages, with a diagnostic age of 12 months or older associated with significantly poorer survival [1]. In a 2005 study, pediatric NBL patients diagnosed between ages 12 and 18 months were found to have a higher 6-year event-free survival rate than those diagnosed later in life [2]. A study in 2011 found similar results, suggesting that while the impact of diagnostic age on prognosis has decreased since it was first detected in the 1970s, it remains a strong indicator of survival rates for pediatric NBL patients [1].
This stark difference in survival rates based on age at diagnosis suggests developmental gene expression differences that may represent unknown or as yet uncharacterized subdivisions of NBL, and indeed, certain gene expression-related distinctions have been associated with differences in pediatric NBL progression and prognosis. Lack of amplification of the MYCN gene, in addition to general hyperploidy, has been found to indicate an improved prognosis for pediatric NBL patients ages 12 to 18 months [3]. ATRX mutations have also been found to be more frequent in pediatric NBL patients with an older age at diagnosis, suggesting that expression of the wild-type version of this gene contributes to survival in patients diagnosed at a younger age [4].
Pediatric ALL also shows elevated survival rates for patients diagnosed at a younger age. A 2014 study indicated that survival of pediatric ALL patients decreased with age at diagnosis, except for those diagnosed within the first year of life, who had the worst prognosis [5]. While mutations of several genes have been correlated with survival in this cancer, none has been assessed in the context of age at diagnosis.
Keeping in mind that many gene expression scenarios, particularly those associated with development [6, 7], involve gradients of expression and of signal pathway activation, and that signal pathway activation gradients have also been reported to produce distinct outputs in the cancer setting [8‐10], we took an approach to biomarker discovery for NBL and ALL that emphasized a continuum of expression levels, with the expectation that, for certain genes, the higher the expression level, the greater the probability of a discrete effect, in this case a discrete effect leading to a survival distinction. Thus, in this study, we used RNA expression data from the TARGET database first to identify genes for which a continuum of expression could be established as correlating with age, and then to independently filter such genes for an association of expression levels with distinct survival rates.
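The two-step filter described above can be sketched as follows. This is a minimal illustration, not the analysis pipeline used in this study: the function names (`spearman`, `screen_genes`), the correlation cutoff, the median split, and the comparison of mean survival times are all assumptions made for the example. A real analysis would use censoring-aware methods such as Kaplan-Meier estimates and a log-rank test.

```python
from statistics import mean

def ranks(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def spearman(x, y):
    """Spearman rank correlation (Pearson correlation of the ranks)."""
    rx, ry = ranks(x), ranks(y)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5 if vx and vy else 0.0

def screen_genes(expression, age, survival_days, rho_cutoff=0.5):
    """Hypothetical two-step screen.

    expression: dict mapping gene name -> per-patient expression values.
    Step 1 keeps genes whose expression tracks age (|rho| >= rho_cutoff).
    Step 2 splits patients at the median expression and reports the
    difference in mean survival (high minus low); this ignores censoring.
    """
    hits = {}
    for gene, values in expression.items():
        rho = spearman(values, age)
        if abs(rho) < rho_cutoff:
            continue
        cut = sorted(values)[len(values) // 2]
        high = [s for v, s in zip(values, survival_days) if v >= cut]
        low = [s for v, s in zip(values, survival_days) if v < cut]
        if high and low:
            hits[gene] = (rho, mean(high) - mean(low))
    return hits

# Toy data (invented for illustration only).
age = [1, 2, 3, 4, 5, 6]
survival_days = [600, 500, 400, 300, 200, 100]
expression = {
    "GENE_A": [6, 5, 4, 3, 2, 1],  # falls steadily with age
    "GENE_B": [3, 1, 4, 1, 5, 2],  # no clear trend with age
}
hits = screen_genes(expression, age, survival_days)
```

In this toy example only `GENE_A` passes the age filter, and its high-expression group has the longer mean survival. Keeping the age step and the survival step separate mirrors the independent filtering described in the text.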
Discussion
The above data provided two basic indications. First, the upregulation or downregulation of a particular set of genes associated with a continuum of age can be used as a starting point to identify gene expression levels associated with survival rates, in this case where the survival rates are, in turn, associated with patient age. The approach above (Fig. 2) provides new candidate biomarkers of survival, and new candidate mediators of tumor development, based on a continuum of expression levels, with the presumption (not directly addressed here) that such a continuum would reflect probabilistic impacts on cellular or physiological events affecting survival. From this base of candidates, further filters were applied to identify and validate the associations between gene expression level and survival. This approach represents an important, distinct starting point in comparison to many common approaches to identifying biomarkers and drivers of tumorigenesis. It is motivated by evidence indicating that amplification of signaling pathways, rather than potential on/off switches, can ultimately have highly discrete phenotypic results, not only in tumorigenesis [10, 14] but also in normal development [6, 7]. The empirical approach that serves as the starting point for many survival biomarkers, such as transfecting an oncoprotein and assaying for increased cell division in tissue culture, may not be possible for certain biomarkers or facilitators of tumorigenesis. And indeed, as discussed further below, several of the genes identified above have little previous connection to tumorigenesis; these are perhaps genes not easily identified by empirical approaches that require what are essentially, but unnaturally, on/off switches in signaling or other effects for a detectable output. Other paradigms with a component of continuity and correlation have had similar successes in the absence of empirical approaches, for example, the correlation of mutation burdens with cancer immune responses and responses to immunotherapy [15‐18], and the correlation of mutation burden in haematopoietic stem cells with subsequent development of acute myeloid leukemia [19].
Second, the data above are consistent with anomalies impacting large regions of single chromosomes, i.e., chromosomes 11 and 17 in pediatric NBL, and the X chromosome in pediatric ALL.
In terms of the functional impacts of potential tumor drivers, or of the expression of proteins that might limit tumorigenesis, it should be kept in mind that age at diagnosis can correspond to considerable variation in the age of tumor onset, which would presumably begin with one tumorigenic cell at an undetermined age. Nevertheless, correlative studies that indicate the value of age-based assessments of gene expression level likely provide, at a minimum, new prognostic biomarker opportunities and new candidates for assessing specific tumor functions.
As for the two genes upregulated with lower survival, USP17L5 is an apparent, relatively poorly studied member of a family of ubiquitin peptidases, and SLC25A5 is a carrier of ADP into the mitochondria and of ATP from the mitochondria to the cytoplasm [20]. The ubiquitin peptidases, including the USP17 sub-family, have been variously associated with cancer progression and with cancer growth inhibition (and apoptosis), apparently depending on the type of cancer [21‐23] or on other factors not yet fully appreciated. SLC25A5 specifically has been reported to be down-regulated with metastasis in hepatocellular carcinoma [24], with no information available for NBL. As with the ubiquitin peptidases, the solute carrier proteins as a family have a complicated association with cancer progression, or lack of cancer progression, that depends on very specific situations.
As for the four genes that are upregulated with youth and better NBL survival, only RND3 has a detailed research history with cancer. That history is contradictory, as with other genes, with reports indicating that high RND3 expression can have both pro- and anti-cancer effects [25‐27]. A recent review of RND3 specifically evaluated the pro- and anti-cancer functions and concluded that the overall impact of RND3 is indeed context dependent [28]. Mutation of SLC12A1 has been associated with short survival in NBL [29]. POF1B has no known previous connection to NBL and little connection to cancer in general.
Pediatric ALL also shows decreased survival with older age at diagnosis, although this correlation has not been extensively investigated [5]. We found that the pediatric ALL patients in the TARGET data set had lower survival with higher diagnostic age, confirming this risk factor for this dataset (Fig. 6). Employing the paradigm discussed above for NBL, the upregulation of 3 genes was found to be associated with poor survival and high diagnostic age in pediatric ALL (Additional file 1: Figure S1), none of which has any previous connection to cancer; and the upregulation of 17 genes was found to be associated with high survival and low diagnostic age in this cancer (Additional file 1: Figure S1).
Of the 17 genes that, when upregulated, were associated with high survival and low diagnostic age in pediatric ALL patients, only ZNF81 is located on the X chromosome, discussed below. Of the other 16 genes, COL5A1, GABBR1, HACE1, EPHA7, and TRIP11 have well-documented associations with cancer. Inhibition of GABBR1 (gamma-amino-butyric acid type B receptor 1) has been associated with progression of colorectal cancer, whereas overexpression of this gene served as an inhibitor of miRNAs that would otherwise lead to proliferation of this cancer [30]. It is possible that this gene serves a similar role when upregulated in younger, higher-surviving pediatric ALL patients. EPHA7 may also be sequestering a microRNA, namely miR-944, which, when expressed at a high level, has been shown to facilitate proliferation of non-small cell lung cancer cells. Thus, high levels of EPHA7 may have the effect of sequestering microRNAs and reducing proliferation in other cancers [31]. HACE1 is an E3 ligase downregulated in several cancers, including gastric cancer and breast cancer, and was found to inhibit the Wnt/β-catenin pathway, thereby playing a role in suppressing tumorigenesis [32, 33]. The pathway involving TRIP11 and triiodothyronine is necessary for localization of TRIP11 to the nucleus and was found to be disrupted in renal cell cancer, leading to progression [34]. Finally, COL5A1 has been found to have associations with gastric cancer, non-small cell lung cancer, and renal cancer [35‐37]. Overall, these overlapping, previous studies are consistent with the upregulation of these genes in the younger patients and in the longer surviving patients. Additional gene ontology information for both NBL and ALL is provided in Additional file 1: Table S13.
The lack of an opportunity to align newly identified biomarkers consistently and firmly with either a pro-cancer or an anti-cancer phenotype, based on a history of gene expression functions in other cancers, can be limiting, but it is in fact what decades of previous research would lead one to expect. First, as noted in specific cases above, there are disparities of gene expression function related to context. Second, it is clear that many cancer hallmarks depend on signal pathway amplification rather than on a molecular on/off switch. This is exemplified by feed-forward apoptosis, whereby transcription factors that activate pro-proliferative genes, such as histone genes, also activate apoptosis-effector genes when these transcription factors are expressed at high levels [8‐10, 38‐42]. Third, even outside of the cancer setting, different tissues can have opposite functions for the same signaling pathway; FGFR3 activating mutations stimulate spermatocyte cell division but inhibit chondrocyte cell division, leading to achondroplasia [43, 44].
In pediatric ALL, a disproportionate number of the genes expressed at a higher level in younger, longer surviving patients were located on the X chromosome. There is very little in the literature regarding X chromosome loss and worse ALL survival, or X chromosome gain and better survival. However, there has been one report, with a small amount of data, indicating loss of the X chromosome in older patients with presumably poorer survival rates, but specific, relative survival data were lacking [45]. As for pediatric NBL and chromosomes 17 and 11, our data clearly indicated an overrepresentation of genes on these two chromosomes that were upregulated with better survival, suggesting chromosome loss in older, worse surviving patients. Again, there are no data currently available regarding chromosome copy number variations (CNV) in very young NBL patients, the subject of this study. (For example, the youngest 20% of NBL patients in this study were all diagnosed under 1 year of age.) However, there have been reports of worse survival among older cohorts of patients with loss of 11q [46, 47]. The above data do not distinguish between CNV of chromosome 11 or 17 and loss or gain of heterochromatic regions that would affect gene expression. However, the previous reports of loss of chromosome 11q and poorer survival are consistent with chromosome loss in poorer surviving patients. 17q gain in NBL has been linked to lower survival in older patients. This is an apparent contradiction; however, these 17q data do not overlap significantly with our data, owing to the lack of 17q information for the younger patients in this study.
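One common way to ask whether a gene list is overrepresented on a single chromosome is a hypergeometric tail probability. The sketch below is illustrative only: the test actually used for the chromosome analysis is not specified in this section, and the function name `hypergeom_upper_tail` and the example counts are hypothetical.

```python
from math import comb

def hypergeom_upper_tail(k, K, n, N):
    """Probability of seeing at least k chromosome-located genes by chance.

    N: total genes assessed
    K: genes among those N located on the chromosome of interest
    n: size of the selected gene list
    k: genes in the selected list located on that chromosome
    """
    total = comb(N, n)
    # Sum P(X = i) for i = k .. min(K, n); comb() returns 0 when the
    # second argument exceeds the first, so impossible draws add nothing.
    favourable = sum(
        comb(K, i) * comb(N - K, n - i)
        for i in range(max(k, 0), min(K, n) + 1)
    )
    return favourable / total

# Invented counts for illustration: 10 genes assessed, 4 on the
# chromosome, 3 selected, 2 of the selected on the chromosome.
p = hypergeom_upper_tail(2, 4, 3, 10)
```

A small probability would indicate that the selected list contains more genes from that chromosome than expected by chance. It would not by itself distinguish copy number change from other regional effects on expression, a limitation noted above.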