Introduction
Diffuse intrinsic pontine glioma and malignant midline gliomas have the worst prognosis of all malignant tumors in children and adolescents [3, 4, 10]. The 2016 WHO classification, now based on both phenotype and genotype, has redefined the family tree of diffuse gliomas [17]. Glial tumors are now grouped according to their driver mutation, e.g. IDH1 mutation, and their astrocytic or oligodendroglial phenotypes, which are often associated with additional specific genetic alterations such as ATRX mutations or 1p/19q co-deletion, respectively. The discovery of recurrent mutations in the histone H3 genes in pediatric high-grade glioma (pHGG) has definitively separated these gliomas from those seen in adults [21, 26]. While G34R/V mutations in the H3F3A gene are found exclusively in the hemispheres, K27M/I mutations in several histone H3 variant genes are specific to midline tumors [23]. The 2016 release of the WHO classification has therefore created a new entity, diffuse midline glioma, H3K27M mutant, to describe these latter tumors irrespective of their specific location along the midline.
In pediatric brain tumors, location has however long been seen as a master driver of oncogenesis that could reflect their different cells of origin [8, 9]. Whether the oncogenic driver mutation overrides location as a crucial determinant of oncogenesis therefore needs to be examined, since a shared biological identity of all these tumors would call for a common therapeutic framework. There are however no reported data showing both a similar biology and a similar outcome of diffuse midline gliomas (DMG) irrespective of their location in the presence of a histone H3-K27M mutation.
Moreover, we have shown that there are two distinct forms of diffuse intrinsic pontine glioma (DIPG) according to the type of histone H3 gene mutated, H3F3A versus HIST1H3B, with respect to differentiation markers, oncogenic programs, response to therapy and evolution [1, 2]. These mutations are mutually exclusive, either because their effect is redundant [16], leading to a global loss of the H3K27me3 repressive mark, or because they cannot transform the same cell, which would suggest distinct cells of origin.
The purpose of this work was therefore to better characterize a large series of pediatric midline high-grade gliomas from the (epi)genomic, transcriptomic and anatomic points of view in order to identify the respective influence of these parameters on their biology, as described by their gene expression, methylome and clinical behaviour.
Moreover, we compared the H3K27me3 landscape between the two main subgroups of DIPG, H3.1-K27M and H3.3-K27M, in patient-derived cellular models.
Materials & methods
Central pathology review
High-grade glioma cases were reviewed centrally to confirm the diagnosis according to the 2007 WHO classification and its 2016 update as previously described [10, 20].
Specific immunostainings were performed to detect nuclear expression of the trimethylation mark at position K27 of the histone H3 tail (1:1000, polyclonal rabbit antibody, Diagenode, Belgium) as well as nuclear expression of the K27M form of histone H3 (1:1000, polyclonal rabbit antibody, Millipore, CA).
Derivation and culture of glioma stem-like cells (GSCs)
GSCs were derived from DIPG tumors at diagnosis as previously described [19]. Briefly, tumor cells were mechanically dissociated from biopsies within 24 h of surgery and further cultured as an adherent monolayer in laminin-coated flasks (Sigma) in neural stem cell medium consisting of NeuroCult NS-A proliferation medium (Stemcell Technologies) supplemented with heparin (2 μg/mL, Stemcell Technologies), human basic FGF (20 ng/mL, Peprotech), human EGF (20 ng/mL, Peprotech), PDGF-AA (10 ng/mL, Peprotech) and PDGF-BB (10 ng/mL, Peprotech). Medium was renewed every other day, and cells were passaged using Accutase (Thermo) when they reached 80% confluence.
Case selection for overall survival analysis and gene expression profiling by microarray
Frozen tissue samples were obtained from 119 pediatric patients (below 18 years old) with brain tumors of WHO grade III and IV in all locations. The samples were collected at Necker Hospital (Paris, France). Complete follow-up information was available for 82.5% of patients (n = 99). Histone H3 gene mutational status was determined by Sanger sequencing for H3F3A, HIST1H3B/C and HIST2H3C [2]. The distribution of samples across the distinct genotype subgroups and locations is detailed in Table 1.
Table 1
Contingency table of samples used for microarray gene expression profiling and overall survival analysis
Location | H3.3-G34R/V | H3.1-K27M | H3.3-K27M | H3 wild-type | Total |
Cortex | 6 | 0 | 0 | 35 | 41 |
Pons | 0 | 13 | 26 | 6 | 45 |
Non-thalamic midline | 0 | 0 | 2 | 12 | 14 |
Thalamic midline | 0 | 0 | 12 | 7 | 19 |
Case and sample selections for methylation analysis
Eighty primary tumor samples were selected for methylome analysis: 22 from the DIPG patient cohort collected at Necker Hospital and 15 from the HERBY trial [10]; all the remaining samples were collected by the Heidelberg group. The distribution of samples across the distinct genotype subgroups and locations is detailed in Table 2.
Table 2
Contingency table of samples used for methylation profiling
Location | H3.3-G34R | H3.1-K27M | H3.3-K27M | H3.2-K27M | MYCN | PDGFRA/pedRTK1 | Total |
Cortex | 10 | 0 | 0 | 0 | 10 | 10 | 30 |
Pons | 0 | 12 | 19 | 1 | 0 | 0 | 32 |
Thalamic midline | 0 | 1 | 17 | 0 | 0 | 0 | 18 |
Gene expression profiling was also conducted by either microarray or RNA sequencing for 5 of these tumors. Eight glioma stem-like cell (GSC) cultures derived from patient biopsies at diagnosis and matching primary tumors were analyzed similarly [19].
Methylation profiling
DNA was extracted from tumors and genome-wide DNA methylation analysis was performed using either the Illumina HumanMethylation450 BeadChip (450k) or EPIC arrays. DNA methylation analysis was performed with custom approaches as previously described [12, 23]. DNA methylation profiles from 50 K27M pHGG were compared to defined supratentorial tumor subgroups, i.e. G34R-H3.3 mutated (n = 10), MYCN (n = 10) and PDGFRA/pedRTK1 (n = 10) subgroup tumors. For t-SNE analysis (t-Distributed Stochastic Neighbor Embedding, Rtsne package version 0.11), the 428,230 uniquely mapping autosomal probes common to the 450k and EPIC arrays were used. The input for the t-SNE calculation was 1 − Pearson correlation, weighted by variance. Clustering analyses were performed using the beta values of the 10,000 most variably methylated probes, ranked by standard deviation. Methylation probes in the heatmap representation were reordered by unsupervised hierarchical clustering using Pearson correlation distance and median linkage.
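The probe selection and distance computation described above can be illustrated with a minimal sketch. It is written in plain Python for readability, whereas the study used R packages, and it is not the authors' pipeline: the function names, the toy beta-value matrix and the unweighted correlation (the variance weighting mentioned above is omitted) are all assumptions made for illustration.

```python
import math
import statistics


def top_variable_rows(beta, n_top):
    """Indices of the n_top probes (rows) with the largest standard deviation."""
    sds = [statistics.pstdev(row) for row in beta]
    order = sorted(range(len(beta)), key=lambda i: sds[i], reverse=True)
    return sorted(order[:n_top])


def pearson(x, y):
    """Pearson correlation coefficient between two equally long vectors."""
    mx = sum(x) / len(x)
    my = sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_distance(beta, rows):
    """Pairwise (1 - Pearson) distance between samples (columns) over selected rows."""
    n_samples = len(beta[0])
    cols = [[beta[r][j] for r in rows] for j in range(n_samples)]
    return [[1.0 - pearson(cols[i], cols[j]) for j in range(n_samples)]
            for i in range(n_samples)]


# Hypothetical toy matrix: 4 probes (rows) x 4 samples (columns) of beta values.
beta = [
    [0.10, 0.12, 0.90, 0.88],  # strongly variable probe
    [0.50, 0.50, 0.50, 0.51],  # nearly constant probe, dropped by the selection
    [0.80, 0.85, 0.15, 0.20],  # strongly variable probe
    [0.30, 0.60, 0.35, 0.65],  # moderately variable probe
]
selected = top_variable_rows(beta, n_top=3)
dist = correlation_distance(beta, selected)
```

In this toy example the nearly constant probe is discarded, and samples 0 and 1 end up closer to each other than samples 0 and 2. A distance matrix of this kind is what a t-SNE implementation would receive when told that its input is already a distance matrix.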
Microarray gene expression profiling
Gene expression (GE) analysis was conducted on an Agilent platform as previously described [2], but using RUV4 correction of batch effects [7] as implemented in the R package ruv. GE data from DIPG were taken from one of our previous studies [2], and microarray analysis was performed for 75 additional pHGG tumors located outside the brainstem. PCA, k-means and t-SNE analyses were performed using the same parameters as for RNA-seq data on the probes associated with the highest standard deviation. One hundred and twenty genes, accounting for 0.79% of the entire probeset, were selected.
RNA-seq gene expression profiling
RNA-seq was performed on 21 primary tumor samples. Libraries were prepared using the TruSeq stranded mRNA sample preparation kit according to the supplier's recommendations, and paired-end sequencing was conducted on an Illumina NextSeq500 to generate a mean of 150 million reads of 75 base pairs per sample. Trimmed reads were then mapped using tophat2 (v2.1.0) and bowtie2 (v2.2.5), first to the reference transcriptome and then, for the remaining reads, to the reference genome. Genes with a row sum of raw counts over the studied samples equal to or below 10 were filtered out to remove non-expressed genes. Outliers were handled with the default setting, minReplicatesForReplace = 7, in the DESeq() function used to estimate size factors, dispersion and model coefficients. Distances between samples were computed using ‘1-Pearson correlation coefficient’ as the distance measure. PCA and t-SNE analyses were performed on the 250 genes associated with the highest variance to keep the same proportion of genes as selected in the microarray analysis. All samples were projected on the first two principal components computed with rlog transformation of the counts of the 120 genes with the highest standard deviation. Using the Rtsne package (v0.11), we applied t-SNE to the same data matrix with the Pearson correlation as a distance and the following parameters: theta = 0, perplexity = min(floor((ncol(rlog_VariableGenes)-1)/3), 30), check_duplicates = FALSE, pca = FALSE, max_iter = 10000, verbose = TRUE, is_distance = TRUE.
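Two of the rules stated above, the low-count filter and the perplexity formula, are simple enough to illustrate with a short sketch. It is written in Python for illustration only, since the analysis itself was run in R. The function names and the toy count table are hypothetical, and the sketch assumes that ncol(rlog_VariableGenes) corresponds to the number of samples.

```python
import math


def filter_low_counts(counts, min_total=10):
    """Keep genes whose raw counts summed over all samples are strictly above min_total."""
    return {gene: row for gene, row in counts.items() if sum(row) > min_total}


def tsne_perplexity(n_samples, cap=30):
    """Perplexity rule quoted in the text: min(floor((n_samples - 1) / 3), cap)."""
    return min(math.floor((n_samples - 1) / 3), cap)


# Hypothetical raw counts for three genes across three samples.
counts = {
    "GENE_A": [0, 3, 7],   # row sum = 10, removed (equal to the threshold)
    "GENE_B": [5, 6, 0],   # row sum = 11, kept
    "GENE_C": [0, 0, 0],   # row sum = 0, removed
}
kept = filter_low_counts(counts)

perplexity_21_samples = tsne_perplexity(21)    # floor(20 / 3) = 6
perplexity_large_set = tsne_perplexity(119)    # floor(118 / 3) = 39, capped at 30
```

With the 21 primary tumor samples sequenced here, the rule would give a perplexity of 6. The cap of 30 only takes effect for data sets of more than about 90 samples.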
RNA-seq was also performed on 6 distinct GSC models using the TruSeq stranded total RNA sample preparation kit (Illumina) according to the supplier's recommendations, and the data were then processed in the same way as for the primary tumors.
Histone ChIP-sequencing and data processing
ChIP-seq of the H3K27me3 epigenetic modification was performed in 6 GSC models at Active Motif according to proprietary methods. The 75-nt sequence reads were generated on an Illumina NextSeq 500 platform and mapped using the BWA algorithm, and peak calling was performed using the SICER1.1 algorithm [27] with a cutoff FDR of 1e-10 and a gap parameter of 600 bp. False positive ChIP-seq peaks were removed as defined within the ENCODE blacklist [5]. Overlapping intervals between the different samples were merged, and the average number of normalized reads in the different samples was calculated for the resulting 16,977 genomic intervals, defined as ‘bound regions’. For further analysis, these bound regions were separated into those overlapping and those not overlapping gene loci using the GENCODE annotation (gencode.v19.chr_patch_hapl_scaff_annotation.gtf). PCA for all samples was generated after scaling to unit variance using the PCA function from the FactoMineR package (v1.41) and plotted using factoextra (v1.0.5).
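The merging of overlapping peak intervals across samples into a common set of ‘bound regions’ can be sketched as follows. This is an illustrative Python version under stated assumptions, not the procedure actually used: the function name and the toy coordinates are hypothetical, and intervals that merely touch are treated as overlapping.

```python
def merge_intervals(intervals):
    """Merge overlapping (chrom, start, end) intervals pooled from several samples.

    Assumption: an interval starting at or before the end of the previous one
    on the same chromosome is considered overlapping and is merged into it.
    """
    merged = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            # Extend the current bound region to cover this interval.
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    return [tuple(region) for region in merged]


# Hypothetical peaks pooled from different samples.
peaks = [
    ("chr1", 100, 500),
    ("chr1", 400, 900),
    ("chr1", 1500, 2000),
    ("chr2", 100, 300),
    ("chr1", 1900, 2100),
]
bound_regions = merge_intervals(peaks)
# bound_regions == [("chr1", 100, 900), ("chr1", 1500, 2100), ("chr2", 100, 300)]
```

Once such a common set of regions is defined, the normalized read counts of every sample can be averaged over the same coordinates, which is what makes the samples comparable in a PCA.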
Merging of the 3 biological replicates of the H3.1- or H3.3-K27M subgroups was performed using the bigWigMerge tool (UCSC kent utils, http://hgdownload.soe.ucsc.edu/admin/exe/macOSX.x86_64/). Heatmaps of H3K27me3 ChIP-seq enrichment across genomic loci were calculated using deepTools version 1.5.11. computeMatrix was used with regions of either +/− 5 kb around the center of the genomic intervals for ‘bound regions’ or +/− 10 kb around the TSS for differentially expressed genes. Heatmaps were plotted with or without k-means clustering (k = 5), and the average profiles of ChIP-seq enrichment in the same −10/+10 kb genomic intervals were also generated for each k-means group. The bigWig files (all signal) were annotated with the ChIPseeker package using the UCSC hg19 known gene annotation and visualized by peakAnno and vennpie.
Survival curve comparisons
The distribution of overall survival (OS) was estimated according to the Kaplan-Meier method, and all comparisons of survival function estimates were performed in PRISM software using a log-rank test. OS was calculated from the date of histo-radiological diagnosis until death of the patient from disease, or until last contact for patients who were still alive.
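For readers less familiar with the Kaplan-Meier method, the estimator can be written in a few lines. The sketch below is an illustrative Python version with hypothetical follow-up data; the survival analyses in this study were run in PRISM, and the log-rank comparison is not reproduced here.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival) at each death time.

    times: follow-up duration per patient (e.g. months since diagnosis).
    events: 1 if the patient died of disease, 0 if censored at last contact.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def median_survival(curve):
    """First time at which the estimated survival drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical cohort of five patients (months, event indicator).
times = [6, 9, 9, 12, 15]
events = [1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)   # survival steps: 0.8, 0.6, 0.3
median = median_survival(curve)       # 12
```

Censored patients leave the risk set without lowering the curve, which is how patients still alive at last contact contribute to the estimate.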
Discussion and conclusions
The recent update of the WHO classification grouped DIPG and infiltrating glial neoplasms of the midline presenting an H3-K27M substitution into a new entity: diffuse midline glioma (DMG), H3 K27M-mutant. This implies that the histone H3-K27M mutation would be a stronger driver of oncogenesis than location. We used a cohort of pHGG sampled at diagnosis to evaluate the similarity of these H3-K27M mutated tumors at both the DNA methylation and gene expression levels, and compared them to other pHGG tumor subgroups in order to support or question this update of the WHO classification [23].
Tumor classification based on microarray gene expression profiling revealed that K27M mutated tumors, whether thalamic or pontine, can be discriminated from all others. Consequently, the molecular subtype appears to influence the gene expression profile more than the infratentorial vs. supratentorial location of the tumor in the brain. Alternatively, location may need to be considered not at the structural level (i.e. brainstem vs. thalamus) but rather at the embryological level (midline vs. hemispheres), thus unifying midline tumors. Accordingly, survival analyses highlighted a similarly poor prognosis for DIPG and thalamic tumors, whether or not they carried a histone H3 mutation. The poor outcome of all midline gliomas with K27M mutations was also observed by Karreman et al. [13]. Taken together, these data support the rationale for defining the same treatment paradigms for both midline K27M tumors and DIPG.
The stratification of our pHGG population based on DNA methylation profiling also supports the similarity between thalamic and pontine H3-K27M tumors. Our results are concordant with previous reports on the discrimination of G34R/V and K27M mutated tumors by DNA methylation [18, 23]. Moreover, t-SNE analysis highlighted a clear distinction of H3-K27M tumors from all other pHGG subtypes. Indeed, G34 mutated tumors and the PDGFRA and MYCN subtypes represent three homogeneous groups distinct from K27M tumors.
The DIPG median survival was similar to that of the large retrospective pHGG cohort recently analyzed by Mackay and collaborators [18]. However, midline and hemispheric tumors were associated with longer median survival in our cohort, 18 versus 13.5 months and 30.5 versus 18 months, respectively. Survival analyses also pointed to a significantly better outcome for histone H3 wild-type non-thalamic midline tumors, which likely reflects that they may be less diffusely growing gliomas and could therefore be more amenable to surgical resection, or that they behave like low-grade gliomas.
Finally, in the gene expression analysis some diffuse midline gliomas without any H3-K27M mutation are grouped with the H3K27M tumors. Interestingly, they all exhibit a loss of the H3K27me3 mark as well. Thus, defining the entity by the H3K27M mutation alone may be too restrictive. Further studies are needed to resolve this issue, especially since diffuse pontine and thalamic malignant gliomas have a poor prognosis irrespective of the presence of an H3K27M mutation, as also recently shown in the HERBY trial (Mackay et al., Cancer Cell 2018).
Interestingly, our methylation profiling data showed a subclassification of DMG, H3 K27M-mutant into two subgroups according to the histone gene affected by the K27M substitution, i.e. H3F3A or HIST1H3B/C. The sole H3.2-K27M sample clustered together with H3.1-K27M tumors, as expected given that both are canonical histone H3 variants with identical roles in the cell [24]. Yet, the similarity of H3.1 and H3.2 mutated tumors should be confirmed with additional H3.2 mutated samples from other cohorts, as only two have been reported in the literature [2, 18]. Histone H3.1 and H3.3-K27M tumors were also discriminated by RNA-seq transcriptome profiling, supporting their intrinsic divergence. This could support the recently reported superiority of RNA-seq over expression microarrays for tumor classification purposes [28]. Mackay and colleagues did not report this distinction between H3.1 and H3.3 mutated tumors using DNA methylation profiling. This difference might result from the 10 times smaller proportion of H3.1 mutated samples they analyzed (8 out of 441 samples), which may have masked the variability brought by these tumors in their very large dataset. In addition, we used a 7 times larger set of probes (10,000 instead of 1381), which might have captured more variation in the overall pHGG DNA methylation landscape.
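The two fold-differences quoted above can be checked with simple arithmetic. The short Python sketch below does so under one explicit assumption: that the H3.1-K27M proportion in the present methylation cohort is 13 of 80 samples, a figure read from Table 2 and not stated in this paragraph. The variable names are illustrative.

```python
# Probe sets: 10,000 probes here versus 1381 in the earlier study.
probe_ratio = 10_000 / 1_381            # about 7.2

# Proportion of H3.1-K27M samples in each methylation cohort.
# Assumption: 13 of 80 samples in the present cohort (read from Table 2).
prop_this_cohort = 13 / 80              # about 16.3%
prop_previous_cohort = 8 / 441          # about 1.8%
fold_difference = prop_this_cohort / prop_previous_cohort   # about 9.0
```

Under this assumption the probe set is about 7.2 times larger and the H3.1-K27M proportion about 9 times higher, in line with the rounded figures of 7 and 10 given in the text.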
It is assumed, and was recently demonstrated by Hoadley et al., that DNA methylation can reflect the epigenetic memory of the cancer cell of origin [11]. Indeed, DNA methylation is inherited through successive divisions and has been shown not only to be tumor-type specific, but also to reflect the cell type and differentiation state of the transformed cells [6]. The clear separation of H3.1-K27M from H3.3-K27M tumors by DNA methylation profiling may indicate that these tumors arise from distinct cells of origin or at distinct differentiation steps in the lineage. This strongly corroborates our previous results showing that DIPG can be divided into two main tumor subgroups, H3.1-K27M and H3.3-K27M, associated with distinct histological and molecular phenotypes, age of onset and location along the midline, H3.1-K27M mutations being almost exclusively seen in the brainstem while H3.3-K27M mutations are distributed everywhere along the midline [2]. Also, the conservation of the DNA methylation differences in GSCs confirms that they are intrinsic characteristics of the tumor cells rather than of the peri-tumor stroma. Furthermore, we demonstrate that despite the same global biochemical consequence of the H3K27M driver mutation, significant differences exist in the H3K27me3 landscape depending on the type of histone H3 variant affected (i.e. H3.1 or H3.3), as shown by PCA.
As a whole, the distribution of the H3K27me3 marks along the genome is different, both at the quantitative and qualitative levels. The average level of trimethylation at K27 is similar in both subtypes since only a small number of loci are highly enriched in H3K27me3 in H3.1-K27M mutated tumors, whereas the majority of the regions presenting this epigenetic mark are associated with a higher signal in H3.3-K27M. These H3K27me3 variations between the two subgroups are associated with modulations of gene expression, many more genes being repressed in H3.3-K27M tumors. Qualitatively, k-means clustering of the distribution of this mark identified 5 clusters of genic regions and 5 clusters of intergenic regions differentially trimethylated at position K27 in the two subgroups of DIPG. We show that among differentially expressed genes, levels of H3K27me3 are in general anti-correlated with gene expression. However, gene expression could not be strictly explained by the levels of H3K27me3 in all cases, leaving open the possibility of additional levels of regulation of gene expression in DIPG.
Overall, we provide molecular and clinical evidence in favor of the unification of all midline K27M mutated tumors proposed in the 2016 WHO CNS classification on the basis of their common driver mutation. As such, these gliomas need to be considered as a single entity in future clinical trials. Further analyses of the biology of H3.3 and H3.1 mutated diffuse gliomas are required to explain the distinction we have reported so far; this could allow specific precision medicine approaches to be tested in these two subgroups of diffuse midline gliomas.
Acknowledgements
DC, CP, TK, JM, EB, JG, MAD acknowledge financial support from Etoile de Martin and Carrefour through the campaign “Les Boucles du Coeur”, DC from Cancéropôle Île-de-France “Émergence 2018”, and SP from Association pour la Recherche en Neurochirurgie Pédiatrique. CJ acknowledges NHS funding to the NIHR Biomedical Research Centre at The Royal Marsden Hospital and the ICR.