Introduction
Glioblastoma multiforme (GBM) is the most common malignant primary brain tumor in adults. An estimated 77,670 new cases of primary CNS tumors are expected to be diagnosed in the United States in 2016 [
1]. Of these, 24,790 will be diagnosed as malignant [
1]. Although the incidence of primary brain tumors is low compared to other cancer types, primary brain tumors give rise to a disproportionate amount of morbidity and mortality, often robbing patients of basic and critical functions such as movement and speech [
2]. The median survival of newly diagnosed GBM patients is only 12–15 months, making GBM one of the most devastating types of cancer [
3]. In fact, the five-year survival rate for primary malignant brain and central nervous system tumors is the sixth lowest among all types of cancers after pancreatic, liver & intrahepatic bile duct, lung, esophageal, and stomach [
2]. Unfortunately, despite substantial investigations into disease mechanisms and at least some advances in currently available treatment options, the outcomes for GBM patients remain dismal [
3].
Although an association between human cytomegalovirus (HCMV) and GBM was first observed in 2002 [
4], there is still a high degree of discordance in the literature regarding the detection of viral agents in CNS tumors [
4‐
28]. These discrepancies have been attributed to a number of issues including the use of different cohorts, differences in sensitivities of different PCR assays for low levels of viral gene expression, and the exquisite sensitivities of assays such as IHC to slight differences in experimental conditions.
In an attempt to remedy the high degree of discordance regarding the detection of HCMV in CNS tumors, an HCMV and glioma symposium was convened in Washington, DC on April 17, 2011. At the conclusion of this workshop, a summary paper was published reporting the consensus position in 4 major areas: 1) the existence of HCMV in gliomas, 2) the role of HCMV in gliomas, 3) HCMV as a therapeutic target, and 4) key future investigative directions [
29]. Based on the evidence presented at the workshop, it was concluded that HCMV sequences and viral gene expression exist in many malignant gliomas and that in vitro studies support the idea that HCMV can modulate key signaling pathways in glioblastomas [
29].
Next generation sequencing (NGS) has the ability to globally interrogate the genetic composition of biological samples in an unbiased manner and with relatively high sensitivity. Applying this technology to pathogen discovery has already shown promise, resulting in the discovery of a novel Merkel cell polyomavirus in Merkel cell carcinoma [
30], for example. In our laboratory, we have utilized NGS technology in the interrogation of Epstein-Barr virus (EBV) in diffuse large B-cell lymphomas [
31] and gastric carcinoma [
32]. The goal of the study presented here was to help resolve the lingering controversy pertaining to the presence of HCMV in GBM while at the same time providing a comprehensive and unbiased assessment of the viral genetic composition of brain tumor biopsies. This analysis failed to find convincing evidence for an association between HCMV or other known viruses and GBM or meningiomas. Nevertheless, we detected human papillomavirus (HPV) and hepatitis B virus (HBV) in some low-grade gliomas (LGG). In addition, we expand on our previous reporting of potential contamination and/or interpretational artifacts that need to be considered in next generation sequencing based metagenomic and metatranscriptomic studies [
33,
34].
Discussion
Although a consensus was reached at the HCMV/GBM symposium in 2011, emerging studies using NGS to assess the viral association with GBM have been unable to recapitulate this association [
9,
10,
14,
26,
28]. In line with these previous studies, our data further support the absence of a direct viral association with GBM. There may be an association of HPV-16, HPV-58, and hepatitis B virus with LGGs; however, additional validation studies are required before any conclusions can be drawn from our initial assessment. Furthermore, given the low abundance of viral reads identified in these cases, it is unclear whether these viruses are truly associated with LGGs or were derived from sequencing contamination. Finally, although the viral detection threshold that we set for the RNA-seq datasets is relatively low (0.07 RPMH), all HCMV read findings were analyzed further irrespective of how low the HCMV read level was and were found to be likely derived from laboratory plasmid contamination.
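The detection threshold mentioned above can be illustrated with a minimal sketch. It assumes that RPMH denotes viral reads per million human-mapped reads, which this section does not spell out, and that a sample passes when its value is at or above the threshold; the function names and the read counts in the usage note are hypothetical and are not taken from the study.

```python
def rpmh(viral_reads: int, human_mapped_reads: int) -> float:
    """Viral reads per million human-mapped reads (assumed definition of RPMH)."""
    if human_mapped_reads <= 0:
        raise ValueError("human_mapped_reads must be positive")
    return viral_reads * 1_000_000 / human_mapped_reads


def meets_threshold(viral_reads: int, human_mapped_reads: int,
                    threshold: float = 0.07) -> bool:
    """True when the sample's RPMH is at or above the threshold.

    The 0.07 default is the value quoted in the text; treating the
    comparison as inclusive (>=) is an assumption.
    """
    return rpmh(viral_reads, human_mapped_reads) >= threshold
```

Under these assumptions, a hypothetical dataset with 50 million human-mapped reads would need at least 4 viral reads to reach 0.07 RPMH, which shows how few reads are needed to cross a threshold this low and why low-level hits call for follow-up inspection.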
The hallmark of herpesviruses, and their key to persistence within their host, is their ability to switch to highly restricted gene expression patterns that allow avoidance of the immune system. To overcome the potential problem of missing viral infections due to this type of viral adaptation, WGS datasets were analyzed. Nevertheless, this approach also failed to identify any meaningful virus associations in the analyzed samples. This is in contrast to a report by Amirian et al. in which they identified HHV-6A and HHV-6B in the WGS datasets from TCGA [
5]. Another study conducted by Cimino et al. also identified HHV-6 and EBV DNA when they analyzed unmapped reads from an NGS-based comprehensive oncology panel [
9]. Although our initial investigation detected reads assigned to HHV-6 and HHV-7, further analysis revealed that all of these reads consisted of human chromosomal telomeric-like repeats (TAACCC). Although HHV-6 and HHV-7 have sequences homologous to this region, no other regions of the viral genomes were represented in the sequence datasets. This strongly suggests that these reads originated from the telomeric regions of human chromosomes rather than representing bona fide HHV-6 or HHV-7 infection.
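The kind of check described above, asking whether a read assigned to HHV-6 or HHV-7 is really just telomeric-like repeat sequence, can be sketched as follows. This is an illustrative heuristic rather than the procedure used in the study: the function names are hypothetical, and the 0.9 cutoff is an arbitrary assumption.

```python
def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def rotations(unit: str) -> set:
    """All cyclic rotations of a repeat unit, so any repeat phase is matched."""
    return {unit[i:] + unit[:i] for i in range(len(unit))}


def telomeric_fraction(read: str, unit: str = "TAACCC") -> float:
    """Fraction of bases covered by the repeat unit on either strand."""
    read = read.upper()
    k = len(unit)
    motifs = rotations(unit) | rotations(reverse_complement(unit))
    covered = [False] * len(read)
    for i in range(len(read) - k + 1):
        if read[i:i + k] in motifs:
            for j in range(i, i + k):
                covered[j] = True
    return sum(covered) / len(read) if read else 0.0


def is_telomeric_like(read: str, min_fraction: float = 0.9) -> bool:
    """Flag reads made up almost entirely of the repeat (cutoff is an assumption)."""
    return telomeric_fraction(read) >= min_fraction
```

A read flagged this way is uninformative about viral presence on its own; as argued above, coverage of other regions of the viral genome would be needed to support a genuine infection.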
EBV DNA reads were identified in a number of the TCGA DNA-seq datasets including 9 TP GBM WGS samples, 6 normal matched blood WGS samples plus 4 additional normal blood WGS samples, and 3 TR GBM WGS samples. In addition, we identified EBV DNA reads in 3 grade I meningioma samples and 4 normal blood samples. All EBV DNA reads identified were low in abundance, with 1–39 reads detected in primary GBM samples, 1–5 reads detected in normal blood samples, and 1–15 reads detected in grade I meningiomas, a result similar to the findings of Cimino et al., who identified 1–18 EBV reads in 5 GBM samples [
9]. We identified 3 TR GBM samples using WGS datasets that were EBV positive, with 1 of these datasets showing moderate EBV levels (1454 viral reads), another showing minimal EBV levels (80 viral reads), and the last containing a single EBV read. Although these three TR GBM WGS datasets were positive for EBV, the corresponding RNA-seq datasets for these samples failed to validate these findings. Without tissue to confirm these findings, it is impossible to determine the origin of these viral reads and we do not feel confident in associating EBV with these TR GBM samples. In addition, based on our past experience in the field of EBV, if EBV were truly associated with these tumors, we would likely see greater than 10 viral RPMH for RNA-seq and thousands of viral reads for DNA-seq [
32,
53]. Finally, given the ubiquitous nature of EBV, the low viral read counts, and the presence of EBV in both tumor and blood samples in relatively equal proportions, we postulate that the EBV reads that were detected likely originated from EBV infected B-cells localized in the tumor stroma and/or from library or sequencing sample cross-contamination.
Because GBM tumors can contain a preponderance of necrotic tissue within the tumor bulk, tumor cells and tumor-associated viruses may be effectively diluted, which could be argued as an explanation for the lack of strong viral detection. However, given the large number of samples analyzed and the careful procurement protocols utilized by TCGA, it is unlikely that the majority of samples fall within this scenario. Further supporting this contention, our analysis of the MRI-localized GBM biopsies from Gill et al. [
43] did not detect any known viruses and there were no differences between samples obtained from the core (presumably more necrotic) and those samples obtained from the tumor margin (presumably less necrotic, with active tumor growth and neoangiogenesis).
The identification of HPV-16, HPV-58 and HBV in a small portion of LGG RNA-seq datasets is a potentially interesting finding. Analysis of the clinical data from these patients using cBioPortal [
58,
59] demonstrated that the majority of virus positive samples were oligodendrogliomas (3 out of the 5 samples) from White males with an average age of 42 (Additional file
11: File S8). The demographics are relatively consistent with the whole LGG cohort (55 % males, 92 % White, and average age 43). Tumor type varied slightly from the whole cohort, which consisted of 193 astrocytomas (38 %), 130 oligoastrocytomas (25 %), and 191 oligodendrogliomas (37 %). In addition, although the genetic profile of these patients demonstrates a variety of alterations, some of the more common alterations observed in the entire cohort (e.g., IDH1, IDH2, ATRX, and TP53) were not observed in these patients with HPV or HBV reads (see reference [
39] for additional details regarding LGG samples). The lack of mutation of one or more of these genes in tumors with detected virus could be due to viral subversion of the corresponding pathways, obviating the need for somatic mutations (for example, through HPV E6-mediated inhibition of the p53 pathway). Nevertheless, further investigation into the association of HPV and HBV with LGGs is warranted.
Both HPV-16 and HPV-58 are considered high-risk HPV types, which are causative agents in the development of cervical carcinoma. The likely mechanism of action for both HPV-16 and HPV-58 is viral integration into the host genome [
60,
61]. Coverage analysis of the HPV-positive LGG datasets indicates that some of the samples display evidence of integration with disruption of the viral E1 gene (Additional file
5: Figures S135-136), with all samples containing HPV reads showing the majority of read coverage mapping to the viral E6 and E7 oncogenes. Due to the low viral read numbers detected in our study, additional validation experiments are warranted to determine whether there is truly an association between LGGs and HPV or whether these findings represent cross-contamination from truly HPV-associated samples.
Like HPV, the mechanism of action for HBV is also integration into the host genome. Visual analysis of the HBV positive LGG datasets demonstrated robust gene coverage within the HBVgp1/HBVgp2/HBVgp3 region with an abrupt termination of gene coverage after HBVgp3 (Additional file
5: Figure S137). Further, the majority of reads align within the HBVgp3 gene, which encodes the regulatory HBx protein. Previous studies have shown that HBx plays a critical role in the pathogenesis of hepatocellular carcinoma [
62,
63]. While this observation is also of potential interest, given the fact that adequate HBV reads were detected in only 1 sample out of 514 LGG datasets (0.19 %), further analysis is necessary to validate this observation.
The RNA CoMPASS analysis of the auxiliary brain tissue sequencing datasets provided a full metatranscriptomic profile including bacterial, fungal, and viral reads; only the virome data are presented in this study. Reads for several bacterial species were identified in the datasets, but it has been our experience that many of these reads originate from environmental contamination [
33,
34,
64] and do not represent true associations.
Due to reports of an association between HCMV and GBM, immunotherapy treatments against HCMV were considered a logical next step and a promising new avenue for cancer therapy. There are several clinical trials in the United States in various stages of completion focused on targeted HCMV therapy in GBM patients. While we await the results of these clinical trials, the results of the Valganciclovir Treatment of Glioblastoma Patients in Sweden (VIGAS) study, a randomized, double-blind, placebo-controlled trial, were recently published, showing trends but no significant differences in tumor volumes between the valganciclovir (an anti-CMV drug) and placebo groups at 3 and 6 months [
65]. However, in a retrospective analysis of the same cohort with additional patients taking valganciclovir for compassionate reasons, the rate of survival of treated patients at 2 years was 62 % as compared with 18 % of contemporary controls with a similar disease stage, surgical-resection grade, and baseline treatment [
66]. Although these are remarkable results, questions have been raised as to the interpretation of the data and whether this survival rate is misleading [
67].
Acknowledgements
The authors would like to thank TCGA, all tissue donors, and the investigators for acquiring and sequencing the samples analyzed in this study. The authors would also like to thank BioServe Biotechnologies, Teresa A. Lehman, and Michael B. Seddon for providing brain tumor RNA samples used in this study. The bioinformatics analysis was carried out in the Tulane Cancer Center NGS Analysis Core.