Background
The Cancer/Testis Antigens (CTAs) are a group of tumor-associated proteins that are typically expressed in normal male germ cells but are silent in normal somatic cells. However, they are aberrantly expressed in several types of cancers [1, 2]. Because of this unique expression pattern, the CTAs are considered attractive targets for cancer biomarkers and immunotherapy [3].
Broadly speaking, the CTAs can be divided into two groups: the CTX antigens, which are encoded by the X chromosome, and the non-X CT antigens, which are encoded by the autosomes. To date, 228 CTAs have been identified, of which 120 (52%) map to the X chromosome (the CTX antigens), while the remainder (the non-X CT antigens) are distributed on the 22 autosomes and the Y chromosome [4]. Interestingly, some gene-poor autosomes, such as chromosome 21 (only 425 genes), are enriched for CTA genes (1.6 CTAs/100 genes), whereas gene-rich autosomes, such as chromosomes 1 (3380 genes) and 7 (1764 genes), are very CTA-poor, with only 0.3 and 0.06 CTAs/100 genes, respectively. Among the sex chromosomes, only 1 CTA is present on the Y chromosome, whereas the X chromosome carries 7.5 CTAs/100 genes, a 125-fold increase over chromosome 7 [4].
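The fold difference quoted above follows directly from the per-chromosome densities. A minimal sketch of that arithmetic, using only the densities given in this paragraph (the variable names are illustrative and not taken from the original analysis):

```python
# CTA densities quoted in the text, in CTAs per 100 genes.
density = {"chrX": 7.5, "chr21": 1.6, "chr1": 0.3, "chr7": 0.06}

# Fold difference of the X chromosome over each autosome mentioned.
fold_vs_x = {
    chrom: density["chrX"] / value
    for chrom, value in density.items()
    if chrom != "chrX"
}

for chrom, fold in fold_vs_x.items():
    print(f"chrX vs {chrom}: {fold:.1f}-fold")
```

The ratio for chromosome 7 reproduces the 125-fold figure, and the same comparison against chromosomes 1 and 21 shows that the X chromosome is CTA-dense even relative to the most CTA-enriched autosome mentioned here.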
Furthermore, the CTX antigens form large gene families of closely related members and are frequently associated with advanced disease and poorer prognosis [5‐10]. Remarkably, as many as 10% of the genes on the X chromosome are estimated to belong to CTX families [11]. Although the role of many of these tumor-associated antigens in the disease process remains unclear, emerging evidence indicates that they function in several important cellular processes such as transcriptional regulation, signal transduction, and cell growth [3]. Some also appear to function as putative proto-oncogenes [12, 13] and are associated with maintaining the undifferentiated state of stem cells [14‐17].
More recently, a majority of the CTAs, especially the CTX antigens, were predicted to be intrinsically disordered proteins (IDPs) [4]. IDPs are proteins that lack a rigid structure, at least in vitro. Despite the lack of structure, most IDPs can transition from disorder to order upon binding to biological targets and often promote highly promiscuous interactions. Thus, IDPs play important roles in transcriptional regulation and signaling via regulatory protein networks and are frequently over-expressed in pathological conditions such as cancer [18, 19]. Consistent with these observations, several CTAs are predicted to bind to DNA, and their forced expression appears to increase cell growth, implying a potential dosage-sensitive function [4]. Taken together, these observations provide a novel perspective on the CTAs, implicating them in processing and transducing information in altered physiological states in a dosage-sensitive manner. Thus, understanding how the CTAs are selectively derepressed in cancer is an important question in cancer biology.
Although the mechanism promoting their derepression is not entirely clear, it is widely held that DNA methylation is one of the central mechanisms responsible for gene silencing [20‐22]. For example, De Smet et al. observed that selective and genome-wide hypomethylation of MAGE-A1, one of the most studied CTAs in cancer cells, coincided with its activation [23‐25]. Several other studies have reported a similar trend for other CTA genes [26‐30]. Roman-Gomez et al. discovered a direct correlation between the methylation level of the HAGE gene and its expression in myeloid leukemia [29]. Similarly, Cho et al. observed expression of the CAGE gene together with hypomethylation of its promoter in gastric cancer [27]. Yegnasubramanian et al. found that although the CTX antigens undergo DNA hypomethylation and overexpression in primary prostate cancers, these changes were more pronounced in metastatic disease, in which many CTX antigens were highly upregulated coincident with poorer prognosis [30]. Consistent with this hypothesis, other studies have shown that inhibiting DNA methyltransferase (DNMT) activity with 5-aza-deoxycytidine (5-AZA) results in robust somatic expression of a set of CTAs both in vitro and in vivo [31]. However, only a few studies have experimentally confirmed promoter demethylation following DNMT inhibition by 5-AZA or silencing by siRNA [13, 32]. In many cases, CTA genes that lack CpG dinucleotides respond to DNMT inhibition, while in other cases CTA genes are not derepressed despite the presence of CpG dinucleotides. For instance, the SPANX genes, which lack a CpG island in the promoter region [33], respond robustly to 5-AZA treatment [34], implicating an indirect mechanism underlying the response, although the presence of such sites at distal regions or within introns cannot be ruled out. It is therefore unclear to what extent these effects are mediated directly by promoter demethylation of the target genes as opposed to being indirectly driven by demethylation in conjunction with transcription factors that activate CTA expression.
Thus, our understanding of CTA gene regulation, and of the mechanisms underlying the abrupt derepression of CTAs in cancer, has not been subjected to a genome-wide analysis to assess its generality. Such an analysis has recently become possible owing to the availability of genome-scale methylation arrays and related technologies. Here, using genome-scale profiles of promoter CpG methylation in 501 samples, comprising 305 normal sperm and somatic cell samples and 196 cancers, we employed a new metric to identify gene promoters that follow the expected pattern of CTA promoter methylation, i.e., promoters that are unmethylated in sperm, methylated in somatic tissues, but unmethylated in cancer tissues. The higher the metric value for a gene promoter, the more closely it follows this prototypical methylation pattern (PMP). Our analysis confirmed that CTA gene promoters broadly follow a PMP. At the genome level, we observed that PMP promoters tend to cluster together on the genome and that the CTA genes are strongly associated with such clusters. Furthermore, we discovered that the binding sites for CTCF, the generic ‘insulator protein’, demarcate the regions of PMP. Genomic regions with PMPs have been observed to be enriched for genes involved in defense response, immune response, and cytokine-cytokine receptor interaction [35, 36]. We also found that a large fraction of CTA genes, especially those associated with clusters of PMPs, coincided with the nuclear lamina-associated domains (LADs). However, we did not observe any significant differences in the above hypomethylation patterns between promoters with CpG islands (CGI) and promoters without CpG islands (non-CGI). Taken together, our results indicate that PMP is a broad phenomenon covering CTAs and that their derepression is explained in significant part by previously observed broad domains of hypomethylation in cancer that are associated with LADs.
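To make the idea of a PMP metric concrete, the following is a minimal sketch of one possible score. It is not the metric used in this study, whose definition is not given in this section; it is a hypothetical stand-in that shares the stated property of being higher when a promoter is unmethylated in sperm, methylated in somatic tissue, and unmethylated in cancer. The inputs are assumed to be methylation fractions between 0 and 1 (beta values, as reported by Illumina arrays), and all sample values are made up.

```python
from statistics import mean


def pmp_score(sperm, somatic, cancer):
    """Hypothetical PMP score for one promoter CpG (illustrative only).

    Each argument is a list of methylation fractions in [0, 1].
    The score is the smaller of the two methylation gaps, so it is
    high only when somatic tissue is methylated relative to BOTH
    sperm and cancer samples.
    """
    somatic_level = mean(somatic)
    gap_vs_sperm = somatic_level - mean(sperm)
    gap_vs_cancer = somatic_level - mean(cancer)
    return min(gap_vs_sperm, gap_vs_cancer)


# Made-up values for two hypothetical promoters.
cta_like = pmp_score(
    sperm=[0.05, 0.10], somatic=[0.90, 0.85, 0.95], cancer=[0.30, 0.20]
)
constitutive = pmp_score(
    sperm=[0.05, 0.05], somatic=[0.04, 0.06, 0.05], cancer=[0.05, 0.05]
)

print(f"CTA-like promoter:     {cta_like:.2f}")
print(f"constitutive promoter: {constitutive:.2f}")
```

Taking the minimum of the two gaps is a deliberate choice in this sketch: a promoter that is methylated in somatic tissue and unmethylated in sperm, but remains methylated in cancer, receives a low score, because it matches only half of the pattern.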
Discussion
Even amongst the so-called tissue-specific genes, the CTAs exhibit a remarkable expression pattern. While typically expressed only in the sperm and repressed in normal somatic tissues, they are aberrantly derepressed in most cancers [1]. However, neither the mechanism nor the functional consequence of this atypical expression pattern is entirely clear for most, if not all, CTAs. While there is evidence to suggest that promoter demethylation might be a major driver of CTA derepression in cancer [3], this mechanism does not appear to apply to all CTAs [34]. Independent of CTA-related investigations, it has been shown that large genomic regions are hypomethylated in some cancers [41, 43]. It is therefore tempting to speculate that CTAs may be swept up as bystanders in this global hypomethylation, leading to their derepression. Based on a systematic analysis of DNA methylation profiling data from various tissues, our results support this thesis.
We found specific hypomethylation of the CTA and CTX promoters in the testis and cancer cells. More specifically, we observed hypomethylation of MAGE, XAGE, PAGE, and GAGE promoters in cancer samples (Additional file 3: Figure S2), confirming several studies that have reported that the activation of these genes in cancer is strongly correlated with promoter demethylation [23, 44, 45]. This result, combined with the well-established association between DNA methylation and gene silencing, suggests that loss of methylation is the predominant mechanism for CTA derepression in cancers. Moreover, the loci with PMP, including the ones in CTA and CTX promoters, cluster on the genome and are associated with LAD regions. This is consistent with broad regions of hypomethylation in cancers that are also associated with LAD regions [41]. Taken together, these findings suggest that hypomethylation and derepression of CTA and CTX genes in cancers are part of a broader phenomenon and may not depend on a specific mechanism. We also found that the broad tendencies revealed by our analyses are independent of CpG islands. This may suggest a non-specific mechanism underlying methylation-mediated derepression of CTA genes. Furthermore, we found that CTCF sites are linked to a sudden change in methylation patterns for both CGI and non-CGI loci. This is consistent with a previous study that found epigenetic silencing of tumor suppressor genes in the absence of CTCF binding [46].
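The association between PMP loci and LADs discussed above is, at its simplest, an interval-overlap question. The sketch below shows one way such a comparison could be set up; it is an assumption for illustration, not the procedure used in this study. The coordinates and the function name are hypothetical, and a real analysis would use annotated LAD coordinates and a statistical test rather than a raw comparison of fractions.

```python
from bisect import bisect_right


def fraction_in_intervals(positions, intervals):
    """Fraction of positions that fall inside any [start, end) interval.

    `intervals` must be sorted by start and non-overlapping.
    """
    if not positions:
        return 0.0
    starts = [start for start, _ in intervals]
    hits = 0
    for pos in positions:
        # Index of the last interval that starts at or before pos.
        i = bisect_right(starts, pos) - 1
        if i >= 0 and pos < intervals[i][1]:
            hits += 1
    return hits / len(positions)


# Made-up coordinates on a single hypothetical chromosome.
lads = [(1_000, 5_000), (20_000, 30_000)]
pmp_promoters = [1_500, 4_200, 25_000, 40_000]
other_promoters = [500, 6_000, 10_000, 21_000]

pmp_fraction = fraction_in_intervals(pmp_promoters, lads)
other_fraction = fraction_in_intervals(other_promoters, lads)

print(f"PMP promoters in LADs:   {pmp_fraction:.2f}")
print(f"other promoters in LADs: {other_fraction:.2f}")
```

The binary search keeps each lookup logarithmic in the number of intervals, which matters little for a single chromosome but keeps the same function usable for genome-scale annotation sets.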
We note that the methylation profiling platform (Illumina HumanMethylation27 BeadChip) used in this study includes only ~1 CpG locus per gene promoter, resulting in a small number of loci corresponding to CTA and CTX genes. Furthermore, a single CpG site may not be representative of an entire promoter in all cases. Although Illumina has a newer and denser methylation chip (Illumina HumanMethylation450 BeadChip), which contains more than 450,000 methylation sites, the number of relevant tissues for which such data exist is currently insufficient. In addition, although previous work has illustrated an inverse correlation between promoter methylation and gene expression level, we could not ascertain this for our data because none of the samples included in the study had corresponding expression data available.
Competing interests
The authors have no competing interests to declare.
Authors’ contribution
The study was conceived by P.K. and S.H. All analysis was done by R.K. The manuscript was written jointly by all authors. All authors read and approved the final manuscript.